[HN Gopher] alphaXiv: Open research discussion on top of arXiv
___________________________________________________________________
alphaXiv: Open research discussion on top of arXiv
Author : sahebjot
Score : 433 points
Date : 2024-09-08 06:57 UTC (16 hours ago)
(HTM) web link (www.alphaxiv.org)
(TXT) w3m dump (www.alphaxiv.org)
| karmakurtisaani wrote:
| I remember seeing this idea some years ago. I think it was called
| qrxiv.org or something like that, but can't find it anymore. I
| hope this one has better luck, getting the users in the
| fragmented space of preprints can be a challenge.
| fuglede_ wrote:
| There's also https://scirate.com/ which occasionally has active
| discussions but, at least in my field, there's far from
| critical mass, and discussions only happen when someone
| kick-starts and advertises a thread.
| jessriedel wrote:
| I believe the most active field is quantum information, which
| has enough activity that papers get dozens of upvotes,
| but the conversation level is basically as you describe.
| rsp1984 wrote:
| I launched gotit.pub [1] last year. It's very much the same
| thing.
|
| [1] https://gotit.pub
| abhayhegde wrote:
| Wow, how is this not getting enough attention when it is
| almost the same thing?!
| forgotpwd16 wrote:
| Because so are PubPeer and SciRate, which have existed for much
| longer. (And neither of those is getting much attention
| either.)
| w-m wrote:
| Hey alphaxiv, you won't let me claim some of my preprints,
| because there's no match with the email address. Which there
| can't be, as we're only listing generic first.last@org
| addresses in the papers. Tried the claiming process twice,
| nothing happened. Not all papers are on Orcid, so that doesn't
| help.
|
| I think it'll be hard growing a discussion platform if there are
| barriers to entry like that to even populate your profile.
| phreeza wrote:
| How would you propose making claiming possible without the risk
| of hijacking/misrepresentation?
| w-m wrote:
| The data on which authors are part of which arxiv papers is
| already in the arXiv database, and in Google Scholar, and in
| other libraries. I appreciate that it's not an easy task to
| get that as a third party. But the burden should be on the
| operators of alphaxiv to figure out a solution for this
| platform to take off, not for me as a user?
| phreeza wrote:
| Yea I agree it shouldn't be on you as a customer, was more
| asking out of curiosity.
|
| I don't think Google scholar has this fully solved either,
| I've seen many misattributed papers there.
| supriyo-biswas wrote:
| The only way I see this working is for paper authors to
| include their public keys in the paper; preferably as
| metadata and have them produce a signed message using their
| private key which allows them to claim the paper.
|
| While the grandparent is understandably disappointed with the
| current implementation, relying on emails was always doomed
| from the start.
| michaelmior wrote:
| Given that the paper would have to be changed regardless,
| including the full email address is a relatively easy
| solution. ORCID is probably easier than requiring public
| keys and a lot of journals already require them.
| westurner wrote:
| W3C Decentralized Identifiers are designed for this use
| case.
| riedel wrote:
| The claiming was 'solved' and ORCID, which both basically do
| no checking at all. It's just a yes/no clicking for fuzzy
| matched author name lists. So I guess it is enough until
| there is a dispute. If you are important enough to be the
| target of trolls then you are in a league beyond most
| research platforms.
| abhayhegde wrote:
| Perhaps by linking to their actual arxiv id?
| xyst wrote:
| There should be an equivalent of S/MIME for researchers if
| e-mail is not accessible.
| Y_Y wrote:
| So you've put a fake email address on your papers? As in, one
| that you can't receive from? Why?
| chipdart wrote:
| > So you've put a fake email address on your papers?
|
| I think you're failing to understand the basics of the
| problem, and even the whole problem domain.
|
| Email addresses are not created/maintained for life. You can
| have an email address, then have your org change name and
| email provider switch, and not to mention that researchers
| leave research institutions and thus lose access to their
| accounts.
|
| You have multiple scenarios where papers can be published
| with authors using email addresses which they lose access to.
| gexaha wrote:
| > You have multiple scenarios where papers can be published
| with authors using email addresses which they lose access
| to.
|
| Btw, why is it considered normal? I think it would be much
| better to mention an e-mail to which you will have
| (more-or-less) permanent access.
| dleeftink wrote:
| Security and affiliation purposes mostly.
| chipdart wrote:
| > Btw, why is it considered normal?
|
| What leads you to believe it isn't normal? I mean, do you
| have an eternal email address? Have you ever switched
| jobs?
|
| Most papers are authored/co-authored by graduate
| students. Do you think all of them will hold onto their
| institutional address after they graduate? A big chunk of
| them will not even continue in the field.
| atoav wrote:
| Why would you expect any institution to support all email
| addresses of their ex-employees ad infinitum?
|
| This would be a security nightmare for them. It is pretty
| normal for universities to have some sort of identity
| management system that automatically provisions emails
| when you are employed there and deprovisions them once you
| are gone.
| acka wrote:
| Why not have a system where students and staff have
| actual email inboxes but alumni have their email
| forwarded?
|
| Most universities use a portal of some sort for easy
| access to personal information and preferences anyway, so
| it shouldn't be too difficult to limit access for alumni
| to only allow them to change a few personal details like
| name / address / phone number and the like, plus email
| forwarding settings. I think the extra cost is negligible
| compared to what universities already spend on alumni
| like newsletters, conferences, dinners, etc.
| blackenedgem wrote:
| An awful lot of free student access programs revolve
| around the uni email address being accredited. For
| example JetBrains will give you a full version of their
| products if you register with a uni email, then require
| you to verify it yearly.
|
| If you forward emails automatically then you'd lose this
| accreditation. I suppose the solution would be an
| accreditation domain that forwards to your uni address
| only, but that's extra work now.
| oefrha wrote:
| If I run a university IT system I certainly don't want
| someone who possibly attended a program thirty years ago
| walking around with an apparent affiliation with my
| institution. I find my institutions' policies of (IIRC)
| one year forwarding + permanent alumni email pretty
| reasonable.
|
| Additionally, making people who want to cold email work a
| little to acquire the current email address is actually a
| good thing, especially if they want to talk about
| something years old. I've generally had a lot more
| pleasant and engaging correspondence with people who
| worked out my email (say from a side project I develop
| pseudonymously) than ones who directly lifted my email
| from my professional profiles. So, expiring emails in
| papers generally isn't a real problem anyway, and it's
| basically never a hurdle if your target is still in
| academic circles. It only becomes a problem in this
| specific context of automated authentication (based on
| something not intended for that purpose).
| atoav wrote:
| I can't answer for everybody, but my (German) university
| is prohibited from doing so by law. We are state
| employees and as such our university needs a contract
| with the people running services that process our (or our
| students') data.
|
| Obviously our university isn't gonna make a 10kEUR/month
| contract just because some prof wants their mail
| forwarded to gmail. Especially not if they are not
| working here anymore.
| znpy wrote:
| There's nothing permanent in life.
|
| Dumb example: you might have published a paper while
| working at a company, but years later the company went
| bankrupt and ceased to operate. Now somebody else owns
| the domains and they will not do you the favour of
| giving you an email address.
|
| Notable example: Sun Microsystems. But there are many
| more, of course.
|
| Or you just moved from one university to another. Or you
| published while in grad school and then moved somewhere
| else.
| MereInterest wrote:
| Here's an example. I have a firstname.lastname@gmail.com
| address, which was intended to be permanent. Google
| turned on two-factor authentication, despite not having a
| second form of authentication available. Instead, they
| required the recovery address for 2FA. The recovery
| address was another Gmail address, which I haven't used
| since 2010, and which also had 2FA turned on using its
| recovery address. That was an SBCGlobal address, a
| company which has long since been purchased by AT&T, and
| the email address is entirely defunct.
|
| I place the blame here entirely on Google for misusing
| forms of identification. Two-factor authentication is
| having two locks on the same door, where recovery
| addresses are having two doors with separate locks. Using
| a recovery address for 2FA is absurd, and caused me to be
| locked out of my permanent email address.
| epanchin wrote:
| "I place the blame on Google because I didn't update my
| recovery address to one that worked"
| MereInterest wrote:
| First, recovery addresses are for recovery when access
| has been lost. They are an alternate method of entry when
| the primary method of entry has been lost. They are _NOT_
| an extra method of validation to be used for the primary
| method of entry.
|
| When Google switched from offering 2FA to requiring 2FA,
| it would have been acceptable for them to require a
| second form of authentication to be added on the next
| log-in. It is not acceptable for Google to pretend that
| they have a second form of authentication when they do
| not.
|
| Second, up until the moment it was needed, I had access
| to my recovery address. Google locked me out of my
| primary address _and_ my recovery address simultaneously.
| QuadmasterXLII wrote:
| Did you notice that the issue was that OP had failed to
| update the recovery address of their recovery address,
| and google removed access to both the main email and the
| recovery email at the same time?
| msteffen wrote:
| That is just not always possible. An example that should
| be familiar to HN: I worked for a period at a startup, and
| used my email at that startup (my only work email at the
| time, as that's where I was working!). Then the startup
| ran out of money and was sold. Hence the email no
| longer worked.
|
| Should I have waited until the startup had more revenue?
| We were profitable at the time (we were B2B and the
| layoffs did us in)
| qwertox wrote:
| > Email addresses are not created/maintained for life
|
| Then don't pretend that it is an email address.
|
| I mean, it's true that email addresses are not guaranteed
| to be assigned for life, but putting a fake email address
| on a paper is misleading.
| chipdart wrote:
| > Then don't pretend that it is an email address.
|
| I think you don't know what an email address is, and how
| they are used.
|
| > (...) but putting a fake email address(...)
|
| This nonsense of "fake email address" was only brought up
| as a baseless accusation. There is zero substance to it,
| and it's been used as a red herring in this discussion.
|
| Focus on the problem: do you expect any and all email
| addresses you published somewhere years ago to continue to
| work?
| qwertox wrote:
| > [...] you won't let me claim some of my preprints,
| because there's no match with the email address. Which
| there can't be, _as we're only listing generic
| first.last@org addresses in the papers_.
|
| I understood it this way: org is not handing out
| first.last@org to the employee, but using an email format
| in order to clarify that "first last" is working at org
| and collaborated on the paper not in private, but as an
| employee.
|
| He might have gotten last.f@org assigned as a valid email
| address from the org, but that one is not being used on
| the paper, while first.last@org is invalid.
|
| > I think you don't know what an email address is, and how
| they are used.
|
| You should know that this kind of comment should not be
| made on HN, see the guidelines [0] ("Be kind. Don't be
| snarky. Converse curiously; don't cross-examine. Edit out
| swipes.").
|
| > do you expect any and all email addresses you published
| somewhere years ago to continue to work?
|
| No. But that is irrelevant to this conversation.
|
| [0] https://news.ycombinator.com/newsguidelines.html
| Sophira wrote:
| Let's say that John Smith at XYZ Corp has authored a
| paper. The company obviously wants recognition and so
| they use their corporate email address "jsmith@xyz.com".
|
| John has since moved on and is earning more at ABC Corp
| instead. XYZ Corp has duly reclaimed John's old email
| address, and John cannot receive emails at said address
| any longer.
|
| This is the situation the OP is in. It was never a "fake
| email address". They did not _literally_ type
| "first.last@org", that was an example suitable for using
| in their comment.
|
| [edit: I'm actually wrong with that last statement, as it
| turns out. While it wasn't a fake email address, the
| situation is slightly more nuanced in that OP actually
| _did_ say "{first}.{last}@hhi.fraunhofer.de" in the
| paper, as there were multiple authors who all had the
| same email address format - see
| https://news.ycombinator.com/item?id=41479618. I still
| think this is a valid method, though, and it's certainly
| not fake. Besides, the problem I outlined sounds like it
| probably remains an issue even if it's not the exact
| problem OP is experiencing.]
| qwertox wrote:
| Ok, so they used a template on the paper, namely
| "{first}.{last}@hhi.fraunhofer.de", while the email
| addresses, if the names are applied to the template, do
| in fact yield valid email addresses.
|
| It sounded as if they were using
| "john.doe@hhi.fraunhofer.de" while in reality it was an
| invalid email address (" _because there's no match with
| the email address_ "), that he would have tried to claim
| co-authorship via his "real" address, which might be
| something like "j.doe2@hhi.fraunhofer.de" (but luckily is
| not).
|
| It's all clear now. Thank you for your explanation.
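A minimal sketch of the template expansion described above, assuming a claiming service has the paper's author list and the literal template string from the PDF. The function names `expand_template` and `claim_matches`, and the accent-stripping normalization, are hypothetical illustrations and not alphaXiv's actual matching logic.

```python
import unicodedata


def expand_template(template: str, first: str, last: str) -> str:
    """Fill a '{first}.{last}@domain' template with a normalized name."""

    def norm(part: str) -> str:
        # Assumption: addresses use lowercase ASCII without spaces.
        decomposed = unicodedata.normalize("NFKD", part)
        ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
        return ascii_only.lower().replace(" ", "")

    filled = template.replace("{first}", norm(first)).replace("{last}", norm(last))
    return filled.lower()


def claim_matches(template: str, authors, claimed_email: str) -> bool:
    """Return True if the claimed address is one the template can produce."""
    candidates = {expand_template(template, first, last) for first, last in authors}
    return claimed_email.strip().lower() in candidates


if __name__ == "__main__":
    template = "{first}.{last}@hhi.fraunhofer.de"
    authors = [("John", "Doe")]  # placeholder name, as used in the thread
    print(claim_matches(template, authors, "John.Doe@hhi.fraunhofer.de"))
```

A match only shows that the address fits the pattern printed in the paper; the service would still have to verify that the claimant controls that inbox.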
| Y_Y wrote:
| This is what I was asking about and I thank you and GP
| for clarifying the situation. There also seems to be an
| unnecessary flamewar about the impermanence of email
| addresses generally; that's an unfortunate accident.
| Y_Y wrote:
| I infer that you interpreted this question as an attack, or
| at least some sort of criticism. None was meant, I really
| just wanted to know if the email address as written in the
| document was deliberately invalid or not.
| creer wrote:
| You were declaring the address to be "fake". Presumed
| facts not in evidence.
| Y_Y wrote:
| I wasn't declaring anything! It was a question, which is
| why it ended with a question mark. It's a totally
| standard construction in English, and would probably
| include a rising tone if spoken.
|
| I cannot understand how what's written there could have
| been confidently construed as a statement.
| fragmede wrote:
| because you used the word fake in an accusatory tone.
| that seems to not be what you intended, but that seems
| more on you to word it better than to expect all readers
| to interpret your words differently.
| limit499karma wrote:
| Conflating email addresses with identity in the digital age
| is a global technical debt.
| crvdgc wrote:
| Preprints are not required to be fully typeset and
| publishable. In some cases, literally "first.last@org" is
| used as a placeholder for email addresses (to be replaced in
| typesetting).
|
| This is more like a mismatch between "fully edited open-access
| papers" and "trying to use arxiv preprints as an
| approximation of the former".
| Y_Y wrote:
| For the record, at least in my field Arxiv is where the
| action happens and journals are an afterthought. I don't
| put placeholders for contact details in my preprints
| because 1) the addresses likely won't change between drafts
| and 2) lots of readers are going to be reading that version
| so I want them to have access to the real info.
|
| Of course most of that is moot for professional scientists
| because you likely know (or are at least able to find out
| about) the authors already. For example some papers have
| old non-working email addresses for the authors who have
| since moved institution. It's not a problem, since I'll
| just look them up by name if necessary and use their current
| email.
| aragilar wrote:
| No, they used the email from the institute they worked at
| when they produced the paper.
|
| They're no longer at that institute, and that email no longer
| exists (while some institutions give some leeway, I know of
| at least one major university which removes them the day the
| contract ends).
|
| This is a common problem if you're providing services to
| academics and you've tied yourself to using emails as
| identifiers.
| abhayhegde wrote:
| Does not have to be fake or anything. You move from one
| institution to the other and cannot maintain it forever
| anyway.
| auggierose wrote:
| Upload a new version of your paper on arxiv, this time with an
| email address that works.
| Sophira wrote:
| Why should they need to? Their email address _did_ work at
| the time of publication.
| auggierose wrote:
| They don't have to. But then they cannot claim the paper.
|
| It is a good idea in general to make sure that your papers
| contain up-to-date contact information. One way of doing
| this is to use an ORCID iD.
| azhenley wrote:
| I've never understood why we need emails in papers.
|
| Who sends emails to paper authors? How often do they respond?
| How fast do the email addresses go out of date? I lost access
| to my email address included in most of my papers within 2 years
| of publication.
|
| I see little to no value to have it included in the paper.
| Maken wrote:
| I do email paper authors and I do respond to requests and
| inquiries about my own papers. Even if you don't work at
| the same institution any longer, most universities let you
| redirect your email for many years after you left.
|
| Also, I don't think we are yet at the point when
| human2human communication is not possible.
| azhenley wrote:
| You don't need emails in archival PDFs for human-to-human
| communication.
| xyst wrote:
| > Who sends emails to paper authors?
|
| I do when the paper is not easily available or the
| publisher charges some outrageous fee (have seen $50 for a
| paper in the past).
|
| Authors typically despise the publishers and are happy to
| share their work with anybody interested.
| azhenley wrote:
| For sure. That is why I keep the preprint PDFs on my
| website (along with my current email address).
| CamperBob2 wrote:
| _Who sends emails to paper authors?_
|
| I do, when I'd like to read a paper that's locked behind a
| paywall and not available on sci-hub. Authors of scientific
| papers are much like any other authors... they want to be
| read. The more enlightened among them understand that
| obscurity is a problem rather than a perk. They also tend
| to appreciate engagement in the form of follow-up questions
| (at least from people who actually read the paper.)
|
| Obviously it's not a major concern on arxiv, but in a
| larger historical sense, this type of communication was a
| key original application of email.
| azhenley wrote:
| If an author wants to be read then they will keep the
| preprint PDFs on their website (along with their current
| email address). An added benefit is that Google Scholar
| indexes and links directly to the PDFs instead of the
| publisher website.
| tc4v wrote:
| I know you don't have lifetime access to an institutional email
| address, but using a fake address is so counterproductive.
| You're only going to claim the paper once, and you should do
| it while you have access to your email. Then you update your
| account with a new address.
| Sophira wrote:
| Let's say that John Smith at XYZ Corp has authored a paper.
| The company obviously wants recognition and so they use their
| corporate email address "jsmith@xyz.com".
|
| John has since moved on and is earning more at ABC Corp
| instead. XYZ Corp has duly reclaimed John's old email
| address, and John cannot receive emails at said address any
| longer.
|
| This is the situation the OP is in. It was never a "fake
| email address". They did not _literally_ type
| "first.last@org", that was an example suitable for using in
| their comment.
|
| [edit: I'm actually wrong with that last statement, as it
| turns out. While it wasn't a fake email address, the
| situation is slightly more nuanced in that OP actually _did_
| say "{first}.{last}@hhi.fraunhofer.de" in the paper, as
| there were multiple authors who all had the same email
| address format - see
| https://news.ycombinator.com/item?id=41479618. I still think
| this is a valid method, though, and it's certainly not fake.
| Besides, the problem I outlined sounds like it probably
| remains an issue even if it's not the exact problem OP is
| experiencing.]
| w-m wrote:
| OP here, what I'm actually using is
| "{first}.{last}@hhi.fraunhofer.de"
| (https://arxiv.org/pdf/2312.13299). I see how my earlier
| comment was confusing.
|
| In our case it's for saving space in the paper, and also
| for reducing spam. This small change may now seem silly in
| the age of LLMs, but the papers that have full email
| addresses in them get a considerable amount of fake
| conference and journal participation emails, which is
| annoying.
| Sophira wrote:
| Oh, I see - the situation's more nuanced than I thought,
| then. My apologies.
|
| I still think this is valid (and certainly not the fake
| email address that people are calling it), but yeah, it's
| not what I thought it was.
| rehaanahmad wrote:
| Thanks for reaching out, I am one of the students working on
| this. We are adding google scholar support soon. If your paper
| isn't on Scholar or ORCID, you will need to submit a claim that
| our team reviews. There isn't really any other option, arXiv
| doesn't allow us to view the author's submission email
| automatically (although we are in the process of becoming an
| arXiv labs project soon).
| llmfan wrote:
| hnews tries to say one positive thing challenge impossible
|
| i always love any idea for curating high iq internet community
| noobermin wrote:
| There are only 4 comments so far. It seems a bit early to judge
| the comment section.
| BaculumMeumEst wrote:
| I think curating an internet community that is open-minded and
| participates in good faith is extremely hard as well. Not sure
| which is harder.
| stevage wrote:
| Yours was the first negative comment I saw...?
| llmfan wrote:
| That was the intention behind my comment! No need to thank me
| tho.
| noobermin wrote:
| This seems like a horrible idea. I know we need an alternative to
| peer review but an online comment section feels like something
| worse than that.
| geysersam wrote:
| Why do you feel that? I think it seems excellent. Comments will
| be scrutinized and sorted by the community. People can choose
| if they want to participate in the discussion under their own
| names or not.
|
| I don't know how this page organizes moderation, but I imagine
| there will be some kind of moderation like on most online
| discussion forums.
| tc4v wrote:
| But it's a very difficult problem. An open forum offers a
| platform for trolls and misinformation. You could pretend that
| the community will be able to filter this out but I seriously
| doubt this project is equipped to fight against bots/fake
| accounts better than huge companies like Twitter and
| Facebook.
| parpfish wrote:
| Agreed. Journals have a hard time getting quality reviews
| from known scientists working in the field. Opening it up
| to randoms will be a nightmare.
|
| Remember all that hype around LK-99 room temp
| superconduction a few months back? The substantive
| scientific discussion would absolutely be drowned out by
| curious laymen and/or grifters
| geysersam wrote:
| That can be fixed. Just optionally filter the comments to
| only include sufficiently "reputable" sources
| (potentially people with status in the community or
| people you follow explicitly).
|
| And to be fair, few scientific news get even remotely
| near the attention that superconductivity announcement
| got.
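The optional filter suggested above could look roughly like this. It is only a sketch under stated assumptions: the `Comment` type, the `reputation` mapping, and the `min_reputation` threshold are hypothetical and do not describe any existing alphaXiv feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    author: str
    text: str


def visible_comments(comments, reputation, followed=frozenset(), min_reputation=100):
    """Keep comments from followed authors or authors above a reputation bar.

    `reputation` maps author name -> score; unknown authors count as 0.
    """
    return [
        c
        for c in comments
        if c.author in followed or reputation.get(c.author, 0) >= min_reputation
    ]


if __name__ == "__main__":
    thread = [
        Comment("alice", "The fit in Fig. 2 ignores the second peak."),
        Comment("drive_by", "unless you do X, this is worthless"),
        Comment("bob", "Has anyone reproduced the measurement?"),
    ]
    scores = {"alice": 250, "drive_by": 3, "bob": 40}
    for c in visible_comments(thread, scores, followed={"bob"}):
        print(c.author, "-", c.text)
```

Because the filter is opt-in and applied on the reader's side, nothing is deleted; readers who want the unfiltered thread can set `min_reputation=0`.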
| geysersam wrote:
| Stack Overflow and Wikipedia fight bots pretty well
| despite neither being backed by huge rich organizations.
| parpfish wrote:
| The whole thing gives me anxiety.
|
| Writing a paper and dealing with the inane requests from three
| reviewers was already frustrating and stressful. Now you open
| up a never ending review process of random people making
| demands ("unless you do X follow up analysis, this is
| worthless").
|
| also you need to be able to say a paper/project is done and let
| yourself move on. If your job turns into "respond to feedback
| on every paper you've ever written, every day" you'll never
| start anything new.
| Nuzzerino wrote:
| That sounds like a good thing if your goal is to advance
| mankind's knowledge and not just your career. No one is
| forcing anyone to respond to the comments. Also, it's not
| clear whether your "demands" example would even pass the
| moderation guidelines there.
| parpfish wrote:
| If people can't make a career doing science, science
| doesn't get done.
|
| It's one thing to optimize an abstract pursuit of
| knowledge, but you also gotta remember that you need this
| to be a job people are willing/able to do.
| m000 wrote:
| Outside concerns for the quality of the discussion itself, it
| is only a matter of time before this public pre-submission
| discussion leaches into the peer review process itself. First,
| it will be Reviewer 2 cherry-picking arguments to shut down
| some paper. Then, AI will come: "Write me an accept/reject
| review for arXiv:2409.112233. Add subtle hints for citing the
| following papers: ...".
|
| Peer reviewing is hard work. Give people a readily available
| shortcut for it, and _some_ people, on _some_ occasions will
| take it. Which may in turn force conferences to adopt policies
| forbidding posting on arXiv.
| codegladiator wrote:
| This is great. Already loving the discussions/comments I see
| there.
| scarlehoff wrote:
| I believe this site is missing a very important thing, direct
| links to the different categories with a list of papers. This is
| at least how I (and I believe many others) browse arXiv. I open
| it up in the morning, scroll through a few categories and open a
| few papers that look interesting to me.
|
| I could see myself using alphaxiv for that, and then, if there's
| a comment section, I might even read it, and, who knows, leave a
| comment. But there's no way I'm going to be changing the address
| or going to some other site to search for papers just to see
| whether there are some comments.
|
| ps: I see the extension adds a "discussion" link to arxiv, it is
| a pity that it is only available for Chrome.
| eigenket wrote:
| It sounds like what you want is scirate. As far as I understand
| from this post this new thing is just scirate but lacking the
| interface you're talking about here.
| scarlehoff wrote:
| Indeed. Scirate (I didn't know about it) looks exactly like
| that.
|
| Sadly, the last comments in HEP are more than 2 years old
| (which explains why I had never heard about it, it seems it
| never gained any traction)
| forgotpwd16 wrote:
| Kinda related, Hypothesis (and Diigo iirc in past) has an
| extension/bookmarklet that can provide an annotation/comment
| overlay on any web page/pdf. Guess what is needed for arXiv
| discussion is this overlay but _smarter_, that is, one that
| knows a paper's pdf and web view are the same, and that the
| abstract page is connected to them.
| eigenket wrote:
| What's the main thing that this new website adds over scirate?
| foven wrote:
| I admit to not really being familiar with either, but it seems
| that this allows you to display comments alongside the paper in
| the browser, which is a very nice feature (and overall has a
| nicer coat of paint). At first blush, I find it a bit more
| difficult to figure out what the point of scirate is and how it
| should be used.
| amadeuspagel wrote:
| Great idea.
|
| - The frontpage should directly show the list of papers, like
| with HN. You shouldn't have to click on "trending" first. (When
| you are logged in, you see a list of featured papers on the
| homepage, which isn't as engaging as the "trending" page. Again,
| compare HN: Same homepage whether you're logged in or not.)
|
| - Ranking shouldn't be based on comment activity, which ranks
| controversial papers, rather papers should be voted on like
| comments.
|
| - It's slightly confusing that usernames allow spaces. It will
| also make it harder to implement some kind of @ functionality in
| the comments.
|
| - Use HTML rather than PDF. Something that could be trivial with
| HTML, like clicking on an image to show a bigger version,
| requires you to awkwardly zoom in with PDF. With HTML, you would
| also have one column, which would fit better with the split
| paper/comments view.
| sestep wrote:
| Tiny note: Stack Exchange also allows spaces in display names,
| and they make @ functionality work regardless:
| https://meta.stackexchange.com/a/43020/297476
|
| Agreed that it makes it more complicated though.
| throw_pm23 wrote:
| Counterpoint: please don't do any of the above and keep arxiv
| as it is. It is too valuable to mess it up, it is one of the few
| things on the internet that have not been ruined yet, and the
| "comment activity" can happen in the articles themselves at the
| scale of years, decades, and centuries.
| Epa095 wrote:
| This seems to be a completely different team than arxiv,
| making a discussion forum on the side.
|
| And I prefer this over discussions on 'X'.
| diggan wrote:
| > - Ranking shouldn't be based on comment activity, which ranks
| controversial papers, rather papers should be voted on like
| comments.
|
| How about not ranking things at all? I don't feel like things
| like this should be a popularity/"like" contest and instead let
| the content of the paper/comments speak for themselves. Yes,
| there will be some chaff to sort through when reading, but
| humanity will manage.
|
| Just sort things by updated/created/timestamp and all the
| content will be equal.
| thornewolf wrote:
| that's ranking by recency, which means i can abuse it by
| churning low quality content out to arXiv
| pessimizer wrote:
| > let the content of the paper/comments speak for themselves.
|
| People can't read everything, and have to rely on others to
| filter up the good stuff. If you read something random, based
| on no recommendation, it's charity work (the odds are
| extremely good that it is bad) and you should recommend that
| thing to other people if it turns out to be useful.
| Ultimately, that's the entire point of any of this design: if
| we don't care about any of the metadata on the papers, they
| could just be numbered text files at an ftp site.
|
| The fewer things I have to read to find out they're shit, the
| longer life I have.
|
| I say the opposite: put a lot of thought into how papers are
| organized and categorized, how comments on papers are
| organized and categorized, the means through which papers can
| be suggested to users who may be interested in them, and the
| methods by which users can inject their opinions and comments
| into those processes. Figure out how to thwart ways this
| process can be gamed.
|
| Treat the content equally, don't force the content to be
| equal. Hacker News shouldn't just be the unfiltered _new_
| page.
| gus_massa wrote:
| Sorted by "new"...
|
| Most articles are not interesting, most of the interesting
| ones are interesting only for a niche of a few researchers.
| The front page will be flooded by uninteresting stuff.
| Retr0id wrote:
| > rather papers should be voted on like comments.
|
| I don't think this is an inherently better approach, but maybe
| there should be an option for different ranking mechanisms. You
| could also rank by things like cite-frequency, cite-recency,
| "cite pagerank", etc.
| throwthrowuknow wrote:
| Agree, don't sink a bunch of effort into creating a ranking
| algorithm. Expose metrics that users can sort or filter by
| which will work for both signed in and signed out. If you
| want to add more tools for signed in users then let them
| define their own filters that they can save like comment
| activity plus weighted by author, commenter, recency, topic
| etc. See the nntp discussion that was on here the other day.
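|
| A minimal sketch of the "expose metrics, let users save their own
| filters" idea, assuming hypothetical metric names (comments,
| citations, age_days) and a plain weighted sum. None of this is
| alphaXiv's actual API; it only illustrates the shape of the idea:

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical per-paper metrics; field names are illustrative only.
    title: str
    comments: int
    citations: int
    age_days: float


def score(paper: Paper, weights: dict[str, float]) -> float:
    """Weighted sum of the exposed metrics named in `weights`."""
    return sum(w * getattr(paper, name) for name, w in weights.items())


def rank(papers: list[Paper], weights: dict[str, float]) -> list[Paper]:
    """Sort papers by a user's saved weights, highest score first."""
    return sorted(papers, key=lambda p: score(p, weights), reverse=True)


papers = [
    Paper("A", comments=40, citations=2, age_days=3.0),
    Paper("B", comments=5, citations=30, age_days=200.0),
    Paper("C", comments=12, citations=10, age_days=30.0),
]

# Two hypothetical saved filters: one favors recent discussion,
# the other favors citation count.
by_discussion = {"comments": 1.0, "age_days": -0.1}
by_citations = {"citations": 1.0}

print([p.title for p in rank(papers, by_discussion)])
print([p.title for p in rank(papers, by_citations)])
```

| Because the weights live with the user rather than the site, there
| is no single global ranking to game, though each exposed metric
| could still be inflated on its own.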
| dartos wrote:
| Yep. User driven ranking leads to people gaming the system
| for internet points.
| anamexis wrote:
| It doesn't seem like citations would be good for discovery,
| because there must be a significant latency between when a
| paper is released and when citations start coming in.
| bee_rider wrote:
| Probably it would be best to just get a site on the web and
| expose a bunch of different metrics so people can sort by
| whatever.
|
| Citations are probably not the best metric for discovery,
| but also this really just makes me wonder if papers are not
| the best thing for discovery. An academic produces ideas,
| not papers, those are just a side-effect. The path is
| something like:
|
| * come up with an idea
|
| * write short conference papers about it
|
| * present it in conferences
|
| * write journal papers about it
|
| * maybe somebody writes a thesis about it
|
| (Talking to people about it throughout).
|
| If we want to discover ideas as they are being worked on, I
| guess we'd want some proxy that captures whether all that
| stuff is progressing, and if anybody has noticed...
|
| Finding that proxy seems incredibly difficult, maybe
| impossible.
| michaelmior wrote:
| I'm not sure I agree about papers just being a side
| effect. An idea by itself has significantly less value
| than an idea which has been clearly documented and
| evaluated. I think a paper is often still the best way to
| do this.
| impendia wrote:
| > Use HTML rather then PDF.
|
| The PDF is the original paper, as it appears on arXiv, so using
| PDF is natural.
|
| In general academics prefer PDF to HTML. In part, this is just
| because our tooling produces PDFs, so this is easiest. But
| also, we tend to prefer that the formatting be semi-canonical,
| so that "the bottom of page 7" or "three lines after Theorem
| 1.2" are meaningful things to say and ask questions about.
|
| That said, the arXiv is rolling out an experimental LaTeX-to-
| HTML converter for those who prefer HTML, for those who usually
| prefer PDF but may be just browsing on their phone at the time,
| or for those who have accessibility issues with PDFs. I just
| checked this out for one of my own papers; it is not perfect,
| but it is pretty good, especially given that I did absolutely
| nothing to ensure that our work would look good in this format:
|
| https://arxiv.org/html/2404.00541v1
|
| So it looks like we're converging towards having the best of
| both worlds.
| ethanol-brain wrote:
| > That said, the arXiv is rolling out an experimental LaTeX-
| to-HTML
|
| Some history: https://www.arxiv-vanity.com/
| throw10920 wrote:
| > In general academics prefer PDF to HTML. In part, this is
| just because our tooling produces PDFs, so this is easiest.
|
| The tooling producing PDF by default absolutely makes the
| preference for PDF justifiable. However, tooling is driven by
| usage - if more papers come with rendered HTML (e.g. through
| Pandoc if necessary), and people start preferring to consume
| HTML, then tooling support for HTML will improve.
|
| > But also, we tend to prefer that the formatting be semi-
| canonical, so that "the bottom of page 7" or "three lines
| after Theorem 1.2" are meaningful things to say and ask
| questions about.
|
| Couldn't you replace references like "the bottom of page 7"
| with others like "two sentences after theorem 1.2" that are
| layout-independent? This would also make it easier to rewrite
| parts of the paper without having to go back and fix all of
| your layout-dependent references when the layout shifts.
|
| HTML has strong advantages for both paper and electronic
| reading, so I think it's worth making an effort to adopt.
|
| When I print out a paper to take notes, the margins are
| usually too narrow for my note-taking, and I additionally
| have a preference for a narrow margin on one side and a wide
| margin on the other (on the same side, not alternating with
| page parity like a book), which virtually _no_ paper has in
| its PDF representation. When I read a paper electronically, I
| want to eliminate pagination and read the entire thing as a
| single long page. Both of these things are significantly
| easier to do with HTML than LaTeX (and, in the case of the
| "eliminate pagination" case, I've _never_ found a way to do
| it with LaTeX at all).
|
| (also, in general, HTML is just far more flexible and
| accessible than PDF for most people to modify to suit their
| preferences - I think most on HN would agree with that)
| michaelmior wrote:
| > Couldn't you replace references like "the bottom of page
| 7" with others like "two sentences after theorem 1.2" that
| are layout-independent?
|
| Yes, but I think such references are inherently harder to
| locate. Personally I try to just avoid making references to
| specific locations in the document and instead name
| anything that needs to be referenced (e.g. Figure 5,
| Theorem 3.2).
| throw10920 wrote:
| Yes, I absolutely agree - I just figured that there had
| to be a reason that someone would want to do that.
| Chesterton's Fence and whatnot.
| gwern wrote:
| I increasingly recommend against the Arxiv HTML version. I
| thought it had an acceptable start and they would fix the
| remaining problems and rapidly become on par with the PDF,
| but that seems to not be happening.
|
| The HTML version is _seriously_ buggy; and the worst part is,
| a lot of those bugs take the form of silently dropping or
| hiding content. It's bad enough when half the paper is gone,
| because at least you notice that quickly, but it'll also do
| things like silently drop sections or figures, and you won't
| realize that until you hit a reference like 'as discussed in
| Section 3.1' and you wonder how you missed that. I filed like
| 25 bugs on their HTML pages, concentrating on the really big
| issues (minor typographic & styling issues are too legion to
| try to report), and AFAIK, not a single one has been fixed in
| a year+. Whatever resources they're devoting to it, it's
| apparently totally inadequate to the task.
| generationP wrote:
| I think development on the TeX-to-HTML compiler has slowed
| down at some point, and it's far from perfect yet. Some of
| the issues are probably HTML5 limitations, unlikely to be
| fixed any time soon (unless one wants formulas to become
| graphics).
|
| But there is another problem: It takes too long to load on
| mobile and doesn't reflow. I thought mobile was one of the
| reasons people wanted HTML in the first place!
| amadeuspagel wrote:
| In PDFs on arXiv, syntax highlighted codeblocks are
| graphics.
| gradus_ad wrote:
| > Ranking shouldn't be based on comment activity, which ranks
| controversial papers
|
| But don't we want people's attention drawn to
| controversial/conversation generating papers? The whole point
| of the platform is to drive conversation
| woodson wrote:
| The concern may be about what effect this will have on future
| papers (just like news headlines engineered for clickbait).
| runningmike wrote:
| Many people on earth have names with spaces. So good that a
| username can reflect a real name a person has.
| rehaanahmad wrote:
| Great idea, we'll look into making the home page the trending
| page soon.
|
| Regarding HTML, our original site actually only supported HTML
| (because it was easier to build an annotator for an HTML page).
| The issue is that a good ~25% of these papers don't render
| properly which pisses off a lot of academics. Academics spend a
| lot of time making their papers look nice for PDF, so when
| someone comes along and refactors their entire paper in HTML,
| not everyone is a fan.
|
| That being said, I do think long term HTML makes a lot of sense
| for papers. It allows researchers to embed videos and other
| content (think, robotics papers!). At some point we do want to
| incorporate HTML papers back into the site (perhaps as a
| toggle).
| DoctorOetker wrote:
| I apologize for changing topic here:
|
| Did you bulk download the arXiv metadata, PDF, and/or LaTeX
| files?
|
| I am trying to figure out what the required space is for just
| the most recent version of the PDFs.
|
| I can find mentions of the total size in their S3 bucket, but
| it's unclear if that also includes older versions of the PDFs.
|
| I also wonder if the Kaggle dataset is kept up to date, since
| it states merely 1.7M articles instead of the 2.4M I read
| elsewhere.
|
| Edit: I just found the answers to my question here:
| https://info.arxiv.org/help/bulk_data_s3.html
| ZeroSolstice wrote:
| > The frontpage should directly show the list of papers, like
| with HN.
|
| I disagree. There are numerous times where I have browsed the
| comments on a HN post where people haven't read the article and
| are just responding to the comment thread. The workflow for
| this seems a bit different in that a person would have already
| read a paper and wanted to read through existing discussions or
| respond to discussion. With that, having the search front and
| center would follow as the next steps for a person who read a
| paper and wanted to "search" for discussions related to that
| paper in particular.
|
| HN is more about aimless browsing, which is a bit different than
| researching a specific area or topic.
| tinyhouse wrote:
| We obviously had this for many years with OpenReview, which has a
| different purpose, but having something for every paper is indeed
| needed. I have trouble opening some links, guessing it's still
| under heavy development. Looks nice!
| dangoodmanUT wrote:
| advisors seem very biased to ML, Google, and Stanford
| karencarits wrote:
| There is also https://pubpeer.com/
|
| I worry that fragmentation of this space might not be beneficial,
| so it would be nice if these services could collaborate in some
| way, perhaps using ActivityPub or something
| levocardia wrote:
| Agreed, pubpeer is already a very widely used platform in
| health and biology research. The PubPeer chrome extension is a
| must-have, in my mind, as it alerts you when a paper you find
| (even linked on some other website) has comments or has been
| retracted.
| cs702 wrote:
| How are the creators going to prevent gaming?
|
| I ask because every system I've ever come across for discussing
| and ranking content _without human moderation_ is always, sooner
| or later, gamed.
| rehaanahmad wrote:
| We have a team of enthusiastic reviewers/moderators in a couple
| sub-categories. We plan on growing this team out as the site
| continues to grow. If you'd like to be a reviewer:
| https://docs.google.com/forms/d/11ve-4cL0axTDcqnHF66zX6greFV...
| tuxguy wrote:
| awesome honking idea ! please add "spaces" for biorxiv and
| medrxiv too !
| sundarurfriend wrote:
| I wish for either:
|
| 1) Zoom buttons just for the paper - the article text is often
| tiny, and zooming in with the browser messes up the page layout
| and makes the page practically unusable.
|
| OR
|
| 2) A simple direct button to download the PDF directly. This
| would alleviate the zoom problem since I can view it in my local
| PDF reader with the best settings for me. Having to go to arxiv
| to download the PDF for every paper would be a nuisance over time
| though, so a button in the top bar would make the experience a
| lot better.
| AlexDragusin wrote:
| For me it always downloads the PDF, because I have disabled the
| View PDF in browser option (Toggle ON, on Edge: "Always
| download PDF files"), in browser settings, consider this as a
| solution.
|
| Edit: The above is applicable to arxiv itself, I got confused,
| the alphaxiv.org opens the PDF in a framed way with no option
| to download, indeed.
| rehaanahmad wrote:
| Zoom is in the works! We are adding this in the coming week!
| shayankh wrote:
| so cool!
| abhayhegde wrote:
| Great platform for invigorating research discussions! But seeing
| only AI based (or broadly CS based) research as featured papers
| is a bit discouraging. Perhaps there isn't enough critical mass
| for other topics yet.
| cgshep wrote:
| > Use HTML rather then PDF.
|
| Tenured prof here. Academics don't use HTML, despite its obvious
| advantages. The incentive system is deeply broken. No big-name
| journal or conference will accept a well-formatted HTML over
| their proprietary Latex/Word format. Latex to PDF converters
| generally suck.
| elashri wrote:
| arXiv already provides an HTML version of the articles [1]. The
| authors do not have to provide an HTML version; it is converted
| by arXiv, e.g. [2].
|
| [1] https://info.arxiv.org/about/accessible_HTML.html
|
| [2] https://arxiv.org/html/2409.00838v1
| cgshep wrote:
| Tenured prof here. Every paper of mine goes on Arxiv with no
| exceptions, published under CC BY-NC-ND licenses. Some of us are
| working hard to overcome the system (e.g. look at the IACR's
| efforts). Unfortunately, academics are still hindered by
| institutional inertia; in fact, many prefer the status quo,
| usually those who rely on prestige over _actual_ quality to
| advance their careers.
| michaelmior wrote:
| > usually those who rely on prestige over actual quality to
| advance their careers
|
| Unfortunately for those of us pre-tenure, it's difficult to
| balance these, as I'm sure you're aware. We're evaluated by people
| who may have the best intentions, but don't work directly in
| our field. They then determine whether we keep our jobs. It's
| difficult not to consider prestige as a factor when you know
| those evaluating you will.
| Ar-Curunir wrote:
| What do you mean by the IACR's efforts here? In the crypto
| community it's very much the norm to put everything on eprint,
| and it is very rare to find a crypto paper not on there
| chipdart wrote:
| > (...) in fact, many prefer the status quo, usually those who
| rely on prestige over actual quality to advance their careers.
|
| Your comment doesn't read like one from anyone with any
| relationship with academia. If you had, you'd know that the
| issue is not a vacuous "prestige" but funding being dependent
| on hard metrics such as impact factor, and in some cases with
| metrics being collected exclusively from a set of established
| peer-reviewed journals that must be whitelisted.
|
| And ArXiv is not one of them.
|
| This means that a big share of academia has their professional
| future, as well as their institution's ability to raise
| funding, dependent on them publishing on a small set of non-
| open peer-reviewed journals.
|
| Reading your post, you make it sound like anyone can just
| upload a random PDF to a random file server and call it a
| publication. That ain't it. If you fail to understand the
| problem, you certainly ain't the solution.
| dguest wrote:
| In all fairness, I don't think the grandparent post disagrees
| with anything that is in the parent post here.
|
| Yes, academia has tried to quantify prestige via impact
| factor and peer-reviewed journals. Yes, lots of people (even
| in Academia) feel that the system is being gamed, with the
| publishing houses that own the journals being a common
| scapegoat.
|
| The system isn't broken, but it also keeps its integrity
| through some dynamic tension: a bit of criticism is a good
| thing.
| JadeNB wrote:
| > And ArXiv is not one of them.
|
| But putting your papers on the arXiv, as your parent said,
| doesn't mean you _only_ put them on the arXiv. I put all my
| papers on the arXiv, but I also submit them for publication
| in journals that will help me make the case for funding and
| promotion.
| BeetleB wrote:
| > Your comment doesn't read like one from anyone with any
| relationship with academia.
|
| Your comment reads likewise.
|
| He didn't say he publishes them exclusively on Arxiv. It's
| quite common for professors to post it there as well as
| submit to journals. Many (most?) journals allow for it - they
| don't insist the ones in arxiv be taken down - as long as
| they're posting preprints and not the final (copyrighted)
| version.
|
| As an academic, you should also know that practices vary
| widely with discipline. As an example:
|
| > dependent on them publishing on a small set of non-open
| peer-reviewed journals.
|
| IIRC, NIH grants _require_ publishing in _open_ peer-reviewed
| journals.
|
| Also, lots of disciplines are not heavily reliant on funding.
| In both universities I attended, the bulk of math professors
| did not even apply for grants! It's not required to get
| tenure (unlike engineering/physics). Also often true in some
| economics departments.
|
| As an aside, your comment violates a number of HN guidelines.
| parpfish wrote:
| > Tenured prof here.
|
| Yeah, but every pre-tenure or postdoc is like "I can't fight
| the system right now, I need to publish enough to still have a
| job two years from now"
| DoctorOetker wrote:
| helpful would be cheaper equipment and tools used in
| research, and unrestricted popular access to scientific
| literature
| gigatexal wrote:
| Thank you, thank you, thank you! I've no skin in the game (not
| an academic and a math idiot but I've a hole in my heart for
| Aaron Swartz and what he stood for) but I love that there are
| professors like yourself that believe in the free sharing of
| knowledge.
| gr__or wrote:
| I am very non-eager to help any further platform grow that has
| not been built on-top of sth like atproto (the BlueSky protocol),
| to prevent silos and the monopolist landlords that come with
| those.
|
| Great idea though, would love to use sth like this, if it existed
| on a federated protocol.
| Nuzzerino wrote:
| Can't please everyone.
| chfritz wrote:
| Why limit this to arxiv papers? Why not any paper published
| online, e.g., via https://bibbase.org? btw, very cool that you
| seem to have overcome the initial inertia of getting something
| like this going. The idea is not new, but it's a marketplace
| dynamic that is hard to bootstrap.
| rehaanahmad wrote:
| One of the co-creators of this site. A lot of great suggestions
| I'm reading so far; a lot of them are currently in the works
| (zooming in/out, infra issues for slow loading times on some
| papers, google scholar claiming papers).
|
| For some more context, we are a group of 3 students with a
| background in AI research, and this site was initially built as
| an internal tool to discuss AI papers at Stanford. We've been
| dealing with a lot of growing pains/infra issues over the past
| month that we are in the process of hashing out. From there we
| would love to make a more concerted effort to share this in areas
| outside of AI. Happy to hear your thoughts here, or more formally
| via contact@alphaxiv.org.
|
| I do want to highlight, our site has a team of
| reviewers/moderators and having folks from different subject
| areas is critical to making sure the site doesn't end up a
| cesspool, apply here:
| https://docs.google.com/forms/d/11ve-4cL0axTDcqnHF66zX6greFV....
| bawolff wrote:
| In the search field, it would be kind of cool to list how many
| comments each paper has - e.g. if you want to find the most
| discussed papers on some topic.
| tintor wrote:
| Too much annoying visual clutter on the discussion page, unlike
| Hacker News.
| john-titor wrote:
| Tried to sign up with my corporate email (life sciences, 100k+
| employees worldwide with a big research arm). Says the
| institution is not known to the service. What's the process to
| get it known?
| rehaanahmad wrote:
| Email me at contact@alphaxiv.org, I'll add it asap!
| john-titor wrote:
| Thanks a lot, will do!
| parpfish wrote:
| Just had an idea that may help the moderation AND encourage
| higher levels of discourse -- comments are not published
| immediately.
|
| When I was doing peer reviews, it would often take a day or more
| to read a paper, think it through, and then write up something
| thoughtful and constructive.
|
| If you introduce a mechanism to delay comments (eg, holding all
| messages for 24-72 hours before publishing or only releasing new
| comments on Monday mornings) it would:
|
| - encourage commenters to write longer thoughtful responses
| rather than short quick comment threads
|
| - reduce back and forth flame wars
|
| - ease the burden on moderators and give them time to do batches
| of work
|
| - see if multiple commenters come to the same
| conclusions/critiques to minimize bandwagon effects
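|
| A rough sketch of the two delay policies mentioned above (a fixed
| hold, or release on Monday mornings). The function names, the
| 09:00 release hour, and the use of naive datetimes are all
| assumptions for illustration, not anything the site implements:

```python
from datetime import datetime, timedelta


def release_after_hold(submitted: datetime, hold_hours: int = 24) -> datetime:
    """Publish a comment a fixed number of hours after submission."""
    return submitted + timedelta(hours=hold_hours)


def release_next_monday(submitted: datetime, hour: int = 9) -> datetime:
    """Publish at the next Monday `hour`:00 strictly after submission."""
    days_ahead = (0 - submitted.weekday()) % 7  # Monday is weekday 0
    candidate = (submitted + timedelta(days=days_ahead)).replace(
        hour=hour, minute=0, second=0, microsecond=0
    )
    if candidate <= submitted:
        # Submitted on a Monday at or after the release hour:
        # hold until the following Monday.
        candidate += timedelta(days=7)
    return candidate


# Example: a comment written on Sunday 2024-09-08 at noon.
sunday_noon = datetime(2024, 9, 8, 12, 0)
print(release_after_hold(sunday_noon, hold_hours=48))
print(release_next_monday(sunday_noon))
```

| Batching on a fixed weekday also gives moderators a predictable
| window to review the queue before anything goes live.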
| data_maan wrote:
| What I don't like about this is that they had to build a separate
| system.
|
| Why wasn't it possible to contact arXiv and do this in
| collaboration with them?
___________________________________________________________________
(page generated 2024-09-08 23:00 UTC)