hngopher.com

       [HN Gopher] ArXiv declares independence from Cornell
       ___________________________________________________________________
        
       ArXiv declares independence from Cornell
        
       Author : bookstore-romeo
       Score  : 790 points
       Date   : 2026-03-20 04:24 UTC (1 day ago)
        
 (HTM) web link (www.science.org)
 (TXT) w3m dump (www.science.org)
        
       | adamnemecek wrote:
       | Good call, ArXiv seems like one of the most important
       | institutions out there right now.
        
         | p-e-w wrote:
         | It's so important, in fact, that there should be more than one
         | such institution.
         | 
         | People keep falling into the same trap. They love monopolies,
         | then are shocked when those monopolies jerk them around.
        
           | andbberger wrote:
           | There is: bioRxiv.
        
           | auggierose wrote:
           | I have been using Zenodo instead for a while now. It is more
           | user-friendly, as well.
        
             | mastermage wrote:
             | Zenodo is more for IT papers and also datasets, isn't it?
        
               | auggierose wrote:
               | It can host large datasets as well, yes. It is hosted by
               | CERN, so it is not specifically IT in any way. It also
               | allows you to restrict access to the files of your
               | submission. It has no requirements to submit your LaTeX
               | sources, any PDF will be fine. There are also no
               | restrictions on who can publish. You'll get a DOI, of
               | course.
               | 
               | Everything published on arXiv could also be published on
               | Zenodo, but not the other way around.
        
               | mastermage wrote:
               | Oh, interesting, I didn't know this.
        
               | jruohonen wrote:
               | Zenodo is great too, yes, but their metadata management
               | is somewhat problematic; i.e., it can be changed at will,
               | which makes indexing difficult.
        
             | Al-Khwarizmi wrote:
             | I like it as well, it works great. But I wonder if it would
             | scale if at some point there were a massive exodus from
             | arXiv.
        
               | auggierose wrote:
               | I think it already hosts much more data than arXiv, given
               | that they also host large datasets.
        
           | freehorse wrote:
           | It is just a preprint repository. It is pretty open (the
           | stories where a preprint was rejected or delayed unreasonably
           | are extremely rare). It offers the basic services for a
           | math/compsci/physics themed preprint repository.
           | 
           | I don't see much of a monopoly, nor any "moat" apart from it
           | being recognised. You can already post preprints on a
           | personal website or on github, and there are "alternatives"
           | such as researchgate that can also host preprints, or zenodo.
           | There are also some lesser known alternatives even. I do not
           | see anything special in hosting preprints online apart from
           | the convenience of being able to have a centralised place to
           | place them and search for them (which you call "monopoly").
           | If anything, the recognisability and centrality of arxiv
           | helped a lot in the old, darker days to establish open access
           | to papers. There was a time when many journals would not let
           | you publish a preprint, or had all kinds of weird rules about
           | when you can and when you can't. That is probably still true
           | to some degree.
        
         | koakuma-chan wrote:
         | it just hosts pdfs, no?
        
           | aragilar wrote:
           | It does do a fair amount of filtering of submissions, and
           | it's a long term archive (e.g. for the next 100+ years). I
           | suspect both (but with the former dominating) are the issue.
        
             | bonoboTP wrote:
             | Just put out a torrent and people of the sort at
             | r/DataHoarder will keep it alive for longer than
             | bureaucrats.
        
           | pfortuny wrote:
           | It also hosts the sources and has a very tame but useful
           | pre-acceptance process.
        
           | freehorse wrote:
           | Well, technically, it can also compile your tex file if you
           | upload the tex file instead of the pdf directly, which helps
           | a lot in standardizing the stylistic structure between
           | preprints. Most other repositories are wild west and
           | inconsistent. I really appreciate the similarity in style
           | applied to most preprints there. Moreover, this means you can
           | download not just the pdf but also the source tex file,
           | which can be very useful.
        
             | bonoboTP wrote:
             | The similarity in style comes from conference and journal
             | templates, not from Arxiv. You can style your paper with
             | latex in any style, Arxiv doesn't care. On Arxiv you mostly
             | see preprints that people submit to conferences and
             | journals and they enforce the style.
        
           | IshKebab wrote:
           | Technically yes, socially no.
        
         | kergonath wrote:
         | The French government put a bit of money on the table to help
         | researchers fulfil their open science requirements for
         | government and EU grants, and funded the HAL repository (
         | https://hal.science/ ). It's much smaller than arXiv, but it
         | exists. In other countries like the UK there are clusters of
         | smaller repositories as well, but it's not as well centralised.
        
       | dataflow wrote:
       | This sounds terrible. Of _course_ there's a huge risk of it
       | becoming made for-profit. It almost makes you wonder if the
       | academic publishers are behind this push somehow.
       | 
       | Could they not have made it into some legal structure that puts
       | universities at the top? Say, with a bunch of universities owning
       | shares that comprise the entirety of the ownership of arXiv, but
       | that would allow arXiv to independently raise funds?
        
         | gucci-on-fleek wrote:
         | > Of course there's a huge risk of it becoming made for-profit.
         | 
         | The article says that "it will become an independent nonprofit
         | corporation", and as OpenAI's failed attempt showed, converting
         | a non-profit to a for-profit organization is either _really_
         | hard or impossible.
         | 
         | > Could they not have made it into some legal structure that
         | puts universities at the top?
         | 
         | As a corporation (even a non-profit one), it will have a board
         | of directors. I have no idea what their charter will look like,
         | but I would be surprised if at least one seat wasn't reserved
         | for a university representative, and more than that seems quite
         | likely as well.
        
           | MostlyStable wrote:
           | OpenAI didn't get everything that they wanted, but I very
           | much disagree with calling it a "failed attempt". The
           | non-profit went from owning the entirety of OpenAI to having
           | a ~25% stake.
        
             | gucci-on-fleek wrote:
             | Ah, thanks for the correction.
        
             | ronsor wrote:
             | Sam Altman is a special kind of person; not many could pull
             | off the schemes he does.
        
               | gentleman11 wrote:
               | I doubt it was him who architected it. A team of lawful
               | evil lawyers, more likely.
        
             | cbolton wrote:
             | The non-profit still controls the board, doesn't it?
        
               | weedhopper wrote:
               | As shown by Altman, not really.
        
           | mort96 wrote:
           | Is your argument _really_ that "OpenAI was an independent
           | nonprofit corporation and it worked out great, Arxiv will
           | remain just as non-profit as OpenAI"?
        
             | gucci-on-fleek wrote:
             | No, my argument is that OpenAI could make billions of
             | dollars if they converted from a non-profit to a for-
             | profit, and they only succeeded after years of effort and
             | because they had already structured the company into
             | separate for-profit and non-profit entities. And even after
             | all this, the non-profit still controls the majority of the
             | for-profit entity.
             | 
             | So if OpenAI with billions of dollars only partially
             | succeeded at converting to a for-profit business, then that
             | suggests that organizations with fewer resources (like
             | arXiv) have much worse odds.
        
       | halperter wrote:
       | Statement by arXiv: https://tech.cornell.edu/arxiv/
        
         | reed1234 wrote:
         | Should be the main link. The original article is based on the
         | CEO job posting.
        
       | tornikeo wrote:
       | Now the question is, will arxiv wage a decade-long bloody war
       | with Cornell, using heavy infantry (PhD students), archers
       | (reviewers) and field artillery (AI slop papers), or will the
       | independence be mostly peaceful? Only time can tell.
        
         | alansaber wrote:
         | PhD students are levy infantry at best with Postdocs being the
         | armoured levies.
        
           | dmos62 wrote:
           | Is this Gondor or Mordor?
        
       | psalminen wrote:
       | I might be missing something, but I still don't get the why. I
       | don't see any "problem" that needs to be solved.
        
         | kolinko wrote:
         | The article lists the reasons quite clearly.
        
           | binsquare wrote:
           | For everyone else,
           | 
           | The reason is because arxiv is growing significantly leading
           | to 297,000 deficit in operating costs for 2025 alone.
           | Cornell has helped with donations, along with other
           | organizations that pay membership fees.
           | 
           | As a result, donors + leaders of arxiv think it's best to
           | spin off to increase funding.
        
             | vl wrote:
             | What is unclear is why they need a staff of 27 and 6.7
             | million to operate an essentially static hosting website in
             | 2026.
        
               | swiftcoder wrote:
               | The "essentially static hosting" isn't the cost centre
               | (although with 5 million MAU, it's nothing to sneeze at).
               | The real costs are on the input side - they have an
               | ingestion pipeline that ensures standardised paper
               | formatting and so on, plus at least some degree of human
               | review.
        
               | bonoboTP wrote:
               | Do you mean that the CPU compute cost of turning latex
               | into pdf/HTML is the main cost?
        
               | swiftcoder wrote:
               | No, I mean that the pipeline requires software engineers
               | to build/maintain, and salaries are (as in basically
               | every tech organisation) the dominant cost
        
               | bonoboTP wrote:
               | Then drop it and make people upload a pdf and a zip of
               | the latex sources.
               | 
               | Most people I talk to hate that pipeline and spend a lot
               | of debug hours on it when Arxiv can't compile what
               | overleaf and your local latex install can.
        
               | domoritz wrote:
               | Arxiv can recompile latex to support accessibility and
               | html. Going to pdf submissions would be a major step
               | backward.
        
               | bonoboTP wrote:
               | Make it an external service then, and leave the thing
               | that's already working great to just be.
               | 
               | The reason authors like and use arxiv is that it gives 1)
               | a timestamp, 2) a standardized citable ID, and 3) stable
               | hosting of the pdf. And readers like the no-nonsense
               | single click download of the pdf and a barebones
               | consistent website look.
               | 
               | All else is a side show.
        
               | OneDeuxTriSeiGo wrote:
               | You have to keep in mind that an increasing portion of
               | their time and labor is going towards moderation and
               | filtering due to a mass influx of nonsensical AI
               | generated papers, non-academic numerology-tier hackery,
               | and other useless drivel.
               | 
               | Spinning the service off pushes the labor out onto other
               | universities rather than leaving it solely to Cornell.
        
               | bonoboTP wrote:
               | Is the problem the storage cost for hosting them, the
               | HDDs? I'm sure they can be offloaded to cold storage
               | because most of that slop won't be opened by anyone.
               | 
               | Arxiv doesn't need moderation. Nobody is asking for Arxiv
               | moderation. It needs minimal checks to remove overtly
               | illegal content.
        
               | swiftcoder wrote:
               | > Arxiv doesn't need moderation. Nobody is asking for
               | Arxiv moderation
               | 
               | Seems like a lot of people _are_ asking for moderation.
               | And moderation is a pretty big part of the existing
               | offering[1].
               | 
               | [1]: https://info.arxiv.org/help/moderation/index.html
        
               | OneDeuxTriSeiGo wrote:
               | > Is the problem the storage cost for hosting them, the
               | HDDs?
               | 
               | No. Around half the cost is infrastructure. The other
               | half of the cost is people. i.e. engineers to maintain
               | infra and build mod tools for moderators to operate.
               | 
               | > Arxiv doesn't need moderation. Nobody is asking for
               | Arxiv moderation.
               | 
               | This is just not true. Tons of people ask for arxiv to
               | have moderation, especially since COVID, when antivaxxers
               | and alternative medicine peddlers started trying to pump
               | the medical categories of arxiv with quack science
               | preprints and then used the arxiv preprint and its DOI to
               | take advantage of non-academics who don't really
               | understand what arxiv is, other than that it looks vaguely
               | like a journal.
               | 
               | And doubly so now that people keep submitting AI
               | generated slop papers to the service, trying to flood the
               | different categories so they can pad their resumes or
               | CVs. On top of that, people who don't actually understand
               | the fields they are trying to write papers in are using AI
               | to generate "innovative papers" that are completely
               | nonsensical but vaguely parrot the terms of art.
               | 
               | The only reason you don't see more people calling for
               | arxiv moderation is because they already spend so much
               | time on it. If they were to stop moderating the site it
               | would overflow into an absolute nightmare of garbage near
               | overnight. And people wouldn't be upset with the users
               | uploading this of course, they'd be upset with arxiv for
               | failing to take action.
               | 
               | Moderation is inherently unappreciated because in the
               | ideal form it should be effectively invisible (which
               | arxiv's mostly is).
               | 
               | If you want to see the type of stuff that arxiv keeps
               | out, go over to viXra [1] or you can watch k-theory's
               | video [2] having fun digging through some of the quality
               | posts that live over on that site.
               | 
               | 1. https://en.wikipedia.org/wiki/ViXra
               | 
               | 2. https://www.youtube.com/watch?v=1at9BjQP8CI
        
               | lou1306 wrote:
               | The PDF formatting is anything but standardised. They
               | ingest LaTeX sources, which are formatted according to the
               | authors' whims (most likely, according to whatever
               | journal or conference they just submitted the manuscript
               | to). I'll concede that the (relatively novel) HTML
               | formatter gives papers a more uniform appearance. They
               | also integrate a bunch of external services for, e.g.,
               | citation metrics and cross-references. Still hard to
               | justify such a high cost to operate, but eh.
               | 
               | Also, the "human review" is a simple moderation process
               | [1]. It usually does not dig into the submission's
               | scientific merits.
               | 
               | [1] https://info.arxiv.org/help/moderation/index.html
        
               | OtherShrezzing wrote:
               | I don't see it as an especially extravagant structure or
               | budget. I've seen larger teams with bigger budgets
               | struggle to maintain smaller applications.
               | 
               | I've contracted into some consultancy teams which you
               | could uncharitably describe as "15 people and $4mn/yr to
               | create one PDF per month".
        
             | sanex wrote:
             | Now they're going to have a deficit of 600,000 in operating
             | costs.
        
             | pessimizer wrote:
             | > The reason is because arxiv is growing significantly
             | leading to 297,000 deficit in operating costs for 2025
             | alone.
             | 
             | Dollars? So 300 people's cable bill? That's basically
             | nothing. They're spending too much, and it's still nothing,
             | and the solution is going to be to privatize it and
             | eventually loot it.
             | 
             | You can't hand out a collection plate and get $300K for
             | Arxiv? Your local neighborhood church can. Civilization is
             | obviously collapsing.
        
         | u1hcw9nx wrote:
         | I think the problem described in the 6th paragraph needs to
         | be solved.
        
       | davnicwil wrote:
       | Very unrelated to the article, but I think 'arXiv' as a brand is
       | bad, and really detrimental to what the institution aims to
       | accomplish.
       | 
       | That is, it's not readily parseable, it really gives an insider
       | term vibe - like this isn't for you if you don't already know
       | what it means or how you should read or say it. It sort of
       | reminds me of the overuse of Latin and Latinate terms generally
       | in the old professions and, well, the academy.
       | 
       | Just always struck me as being somewhat at odds with the goal.
        
         | john-titor wrote:
         | I wonder what makes you feel that. I've been publishing
         | preprints on arxiv for close to a decade now and never had any
         | particular feelings about it.
         | 
         | To me it's just a way to get out your work fast, so that there
         | is already a trace of it on the Internets - nothing more and
         | nothing less.
         | 
         | > That is, it's not readily parseable, it really gives an
         | insider term vibe...
         | 
         | Isn't that normal with highly specialized research fields? I
         | agree many papers could benefit from clearer wording, but
         | working in a niche means you sometimes don't reach a broader
         | audience
        
           | davnicwil wrote:
           | It's an opinion, and you feeling no particular way about it
           | is equally valid.
           | 
           | But I did justify it, and maybe to reword slightly: surely,
           | if one of the main drivers is opening up research, the brand
           | name should be something that's less obscure and more
           | accessible / understandable as to what it is on first sight?
           | 
           | Maybe arXiv evoking the word 'archive' with an ancient Greek
           | twist does that for some, but it's clearly a bit cryptic for
           | many, and if the point is to open up, the brand should
           | probably just be something much plainer.
        
             | aragilar wrote:
             | No, it's to be a pre-print server. If someone doesn't know
             | what that means, then they shouldn't be using arXiv.
        
               | davnicwil wrote:
               | everyone has a first time they see a thing and don't yet
               | know what it is.
               | 
               | Using a brand as a filter where you have to already know
               | what it means to get it is exactly the opposite of what
               | it's supposed to achieve.
               | 
               | Consider the most exclusive (successful) brands that
               | exist. Even there, where exclusivity is a brand goal,
               | none of them have this property of being obscure on first
               | contact.
        
               | bonoboTP wrote:
               | You usually get introduced to it by your academic
               | supervisor or collaborators as a masters or PhD student.
               | If you're a solo researcher who has made a significant
               | contribution on the frontier of science, I'm sure you'll
               | be able to understand how Arxiv works as well. Because I
               | assume you have had some conversations with other experts
               | in the field. If you're a full on autodidact with no
               | contact to any other researchers in the field, well,
               | maybe it's better if you chat with some other people in
               | that field.
               | 
               | It's reasonable to have a tradeoff here to avoid cranks
               | and now AI psychosis slop. You can still post on
               | ResearchGate and academia.edu or your own GitHub page or
               | webhosting.
        
             | Cordiali wrote:
             | I've never even connected the 'X' to the Greek letter chi.
             | I just kinda accepted it as one of many groovy web 2.0
             | misspellings in search of a domain and trademark.
        
               | matt-noonan wrote:
               | This is particularly funny because arXiv doesn't just
               | predate Web 2.0, it nearly predates the public web
               | entirely (only missing it by about two weeks)
        
         | nixon_why69 wrote:
         | > like this isn't for you if you don't already know what it
         | means
         | 
         | Isn't that actually kind of a good brand signal for a repo of
         | very specialized papers? "Fun with learning" in comic sans
         | wouldn't help credibility.
        
         | vasco wrote:
         | This is the type of guy that will suggest paper.ly as a better
         | name with a straight face and then we wonder why the internet
         | is turning to shit
        
         | jltsiren wrote:
         | It's a classic story of someone having to pick a name quickly,
         | which then gets established long before anyone who cares about
         | branding is aware of its existence.
         | 
         | The original service didn't even have a name, only a
         | description, and it was amusingly hosted at xxx.lanl.gov. But
         | LANL wasn't really interested in it, and the founder eventually
         | left for Cornell. At that point, the service needed a domain
         | name, but archive.org was already taken.
         | 
         | And besides, the name has Ancient Greek influences. A similar
         | Latinate term might be something like "archive".
        
           | davnicwil wrote:
           | Interesting, thanks for the context! Makes it more
           | understandable as a choice.
        
           | bonoboTP wrote:
           | I thought the X was an allusion to LaTeX.
        
             | jltsiren wrote:
             | Usually, when you see "ch" in a Latin word, it represents a
             | "kh" in the original Greek word. Both TeX and arXiv use "X"
             | to represent it instead. TeX because Knuth chose to be
             | fancy, and arXiv because "archive" was no longer available.
        
         | vulcan01 wrote:
         | By your criterion, Google, Apple, and Amazon are terrible names
         | as well.
        
           | davnicwil wrote:
           | > if you don't already know what it means or how you should
           | read or say it
           | 
           | Google I'll grant you, though it's still pretty phonetic and
           | easy to read. The other two not at all, they're incredibly
           | well-known, instantly recognisable words.
        
         | spiralcoaster wrote:
         | You're right. The name is just classic gatekeeping and elitist,
         | clearly. I am 100% certain that's why they chose it. If they
         | really cared about inclusion, they would have called it
         | research.io
        
       | OutOfHere wrote:
       | With 300K for the CEO, its enshittification will commence
       | imminently. It will now serve to maximize revenue. Just wait and
       | watch while they issue a premium membership, payment requirements
       | for authors, and other revenue generators to please their
       | investors.
        
         | exe34 wrote:
         | they'll just turn into a shitty journal at this point, they
         | just need to introduce peer review and they can start competing
         | with the real journals on price point.
         | 
         | another will need to rise to take its place.
        
           | OutOfHere wrote:
           | > they'll just turn into a shitty journal at this point
           | 
           | To this end, they added an endorsement requirement this year:
           | https://blog.arxiv.org/2026/01/21/attention-authors-
           | updated-...
        
       | Peteragain wrote:
       | .. and soon to be dependent on US military funding? Controlled by
       | someone who has run-ins with universities? This'll end in tears.
        
       | Garlef wrote:
       | Maybe they should implement a graph based trust system:
       | 
       | You need your favourite academic gatekeeper (= thesis advisor) to
       | vouch for you in order to be allowed to upload.
       | 
       | Then AI slop gets flagged and the shame spreads through the
       | graph. And flaggings need to have evidence attached that can
       | again be flagged.
        
         | dmos62 wrote:
         | I've often thought that similar trust systems would work well
         | in social media, web search, etc., but I've never seen it
         | implemented in a meaningful way. I wonder what I'm missing.
        
           | IshKebab wrote:
           | Lobsters has this I think. But it also means I've never
           | posted there.
        
         | pred_ wrote:
         | The endorsement system already works along that line:
         | https://info.arxiv.org/help/endorsement.html
         | 
         | It's probably not perfect but in practice, it seems to have
         | been enough to get rid of the worst crackpotty spam.
        
         | ryangibb wrote:
         | You mean like endorsement?
         | https://info.arxiv.org/help/endorsement.html
        
         | justinnk wrote:
         | They have already had a basic form of this for a while [1]
         | 
         | > arXiv requires that users be endorsed before submitting their
         | first paper to arXiv or a new category.
         | 
         | [1] https://info.arxiv.org/help/endorsement.html
        
         | ChrisGreenHeur wrote:
         | Science reduced to people with a PhD?
        
           | budman1 wrote:
           | not a bad first order filter.
           | 
           | can you think of a better one?
        
             | awesome_dude wrote:
             | The whole point of the scientific method was that we could
             | ignore the source of the information, and were instead
             | expected to focus on the value of the information based on
             | supporting evidence (data).
             | 
             | If we go back to "Only people that have been inducted into
             | the community can publish science" we're effectively saying
             | that only the high priests can accrue knowledge.
             | 
             | I say this knowing full well that we have a massive problem
             | in science with sorting the wheat from the chaff, have had
             | it for a VERY long time, and AI is flooding the zone (thank
             | you political commentator I despise) with absolute dross.
        
       | frankling_ wrote:
       | The recent announcement to reject review articles and position
       | papers already smelled like a shift towards a more "opinionated"
       | stance, and this move smells worse.
       | 
       | The vacuum that arXiv originally filled was one of a glorified
       | PDF hosting service with just enough of a reputation to allow
       | some preprints to be cited in a formally published paper, and
       | with just enough moderation to not devolve into spam and chaos.
       | It has also been instrumental in pushing publishers towards open
       | access (i.e., to finally give up).
       | 
       | Unfortunately, over the years, arXiv has become something like a
       | "venue" in its own right, particularly in ML, with some decently
       | cited papers never formally published and "preprints" being cited
       | left and right. Consider the impression you get when seeing a
       | reference to an arXiv preprint vs. a link to an author's
       | institutional website.
       | 
       | In my view, arXiv fulfills its function better the less power it
       | has as an institution, and I thus have exactly zero trust that
       | the split from Cornell is driven by that function. We've seen the
       | kind of appeasement prose from their statement and FAQ [1]
       | countless times before, and it's now time for the usual routine
       | of snapshotting the site to watch the inevitable amendments to
       | the mission statement.
       | 
       | "What positive changes should users expect to see?" - I guess the
       | negative ones we'll have to see for ourselves.
       | 
       | [1] https://tech.cornell.edu/arxiv/
        
         | hijodelsol wrote:
         | I came here to say something similar. As someone who works in a
         | field that applies machine learning but is not purely focused
         | on it, I interact with people who think that arXiv is the only
         | relevant platform and that they don't need to submit their work
         | to any journal, as well as people who still think that
         | preprints don't count at all and that data isn't published
         | until it's printed in an academic journal. It can feel like a
         | clash of worlds.
         | 
         | I think both sides could learn from the other. In the case of
         | ML, I understand the desire to move fast and that average time
         | to publication of 250-300 days in some of the top-tier journals
         | can feel like an unnecessary burden. But having been on both
         | sides of peer review, there is value to the system and it has
         | made for better work.
         | 
         | Not doing any of it follows the same spirit as not benchmarking
         | your approach against more than maybe one alternative and that
         | already as an after-thought. Or benchmaxxing but not exploring
         | the actual real-world consequences, time and cost trade offs,
         | etc.
         | 
         | Now, is academic publishing perfect? Of course not, very very
         | far from it. It desperately needs to be reformed to keep it
         | economically accessible, time efficient for authors,
         | editors and peer reviewers and to prevent the "hot topic of the
         | day" from dominating journals and making sure that peer review
         | aligns with the needs of the community and actually improves
         | the quality of the work, rather than having "malicious peer
         | review" to get some citations or pet peeves in.
         | 
         | Given the power that the ML field holds and the interesting
         | experiments with open review, I would wish for the field to
         | engage more with the scientific system at large and perhaps try
         | to drive reforms and improve it, rather than completely
         | abandoning it and treating a PDF hosting service as a journal
         | (ofc, preprints would still be desirable and are important, but
         | they can not carry the entire field alone).
        
           | bonoboTP wrote:
           | Simply anticipating basic push backs from reviewers makes
           | sure that you do a somewhat thorough job. Not 100% thorough
           | and the reviews are sometimes frivolous and lazy and stupid.
           | But just knowing that what you put out there has to pass the
           | admittedly noisily gatekept gate of peer review overall
           | improves papers in my estimation. There is also a negative
           | side because people try to hide limitations and honest
           | assessments and cherry pick and curate their tables more in
           | anticipation of knee jerk reviewers but overall I think
           | without any peer review, author culture would become much
           | more lax and bombastic and generally trend toward engagement
           | bait and social media attention optimized stuff.
           | 
           | The current balance where people write a paper with reviewers
           | in mind, upload it to Arxiv before the review concludes and
           | keep it on Arxiv even if rejected is a nice balance. People
           | get to form their own opinion on it but there is also enough
           | self-imposed quality control on it just due to wanting it to
           | pass peer review, that even if it doesn't pass peer review,
           | it is still better than if people write it in a way that
           | doesn't care or anticipate peer review. And this works
           | because people are somewhat incentivized to get peer reviewed
           | official publications too. But being rejected is not the end
           | of the world either because people can already read it and
           | build on it based on Arxiv.
        
             | bjourne wrote:
             | I really am not sure about that:
             | https://biologue.plos.org/wp-content/uploads/sites/7/2020/05...
             | 
             | The problem is that "optimizing for peer-review" is not the
             | same thing as optimizing for quality. E.g., I like to add a
             | few tongue-in-cheeks to entertain the reader. But then I
             | have to worry endlessly about anal-retentive reviewers who
             | refuse to see the big picture.
        
               | bonoboTP wrote:
               | Currently a kind of rule of thumb is that a PhD student
               | can graduate after approximately 3 papers published in a
               | good peer reviewed venue.
               | 
               | If peer review were to go away, this whole academic
               | system would get into a crisis. It's dysfunctional and
               | has many problems but it's kinda load bearing for the
               | system to chug along.
        
               | DANmode wrote:
               | No hard rule, no crisis.
               | 
               | Maybe we can go back to very opinionated "true" academia,
               | 
               | where there are institutional gatekeepers,
               | 
               | but they _mostly_ get it right on who to award (and not),
               | 
               | vs the current game of
               | 
               | "whoever plays ball with funding sources the best = the
               | best academic",
               | 
               | which is obviously bullshit.
        
               | vkou wrote:
               | You'll still need to convince the purseholders to pay
               | you, and they'll want some objective metric to measure
               | your output, and whatever metric they pick will be gamed.
        
               | DANmode wrote:
               | The point of my comment was,
               | 
               | in much earlier institutions of knowledge and excellence,
               | 
               | the only transparent metric was whether or not they
               | approved you.
        
               | vkou wrote:
               | That ossifies intellectual monocultures, though. (Or,
               | heaven forbid, if someone has a financial conflict of
               | interest in the private sphere...)
        
               | DANmode wrote:
               | The current solution _doesn't_ resist capture by capital
               | either,
               | 
               | and indeed we're _already left_ with all of the things
               | claimed - the worst of both worlds, really.
        
               | fc417fc802 wrote:
               | But this is already how the purse holders operate. A big
               | group of experts get together and vote on which grant
               | proposals within a given category to fund.
               | 
               | I think it comes down to how the system is structured and
               | how many players there are. The more difficult it is for
               | a small cult to capture control of the funding (or access
               | to instrumentation or awarding of degrees or whatever)
               | for a given area the less likely you are to end up with a
               | monoculture.
               | 
               | Assuming the majority of the funding continues to come
               | from governments then you have a centralized point of
               | leverage that can shape the system. So it should be
               | possible to impose constraints that result in a system
               | that actively prevents monocultures from developing.
        
               | mitthrowaway2 wrote:
               | Maybe their institution should evaluate whether their
               | papers pass muster? It's the one conferring the degree.
        
           | StableAlkyne wrote:
           | I've noticed it's field dependent. Some fields don't really
           | feel much need to publish in a real journal.
           | 
           | Others (at least in chemistry) will accept it, but it raises
           | concern if a paper is _only_ available as a preprint.
        
           | pie_flavor wrote:
           | You may have delivered value in peer review, but on the
           | whole, peer review delivers negative value.
           | https://www.experimental-history.com/p/the-rise-and-fall-of-...
           | 
           | The arXiv vs journal debate seems a lot like 'should the work
           | get done, or should the work get certified' that you see all
           | over 'institutions', and if the certification does not
           | actually catch frauds or errors, it's not making the
           | foundations stronger, which is usually the only justification
           | for the latter side.
        
             | fc417fc802 wrote:
             | Can't say I agree with that position.
             | 
             | Responding largely to the linked article, you can't just
             | ignore the massive increase in funding and associated
             | output that occurred. Scaling almost any system up will be
             | expected to result in creative new failure modes. It's easy
             | to observe that a system isn't great and suppose that
             | removing it would improve things but this very often isn't
             | the case. Democracy is one such example.
             | 
             | There's also the publishing ecosystem that developed around
             | the increased funding. It isn't clear to me why any blame
             | (if it's even valid, see preceding paragraph) should be
             | laid at the feet of the practice of peer reviewing
             | publications rather than such an obviously dysfunctional
             | institution.
             | 
             | Even if we accept the way in which publications have been
             | undergoing peer review to somehow be the root of all evil
             | (as opposed to the for profit publication of taxpayer
             | funded work) - there's more than one way to go about it! A
             | glaringly obvious problem, mentioned in the linked article
             | yet not meaningfully addressed that I saw, is that peer
             | reviewers aren't paid. If this was a compensated task
             | presumably it would be performed much more rigorously.
             | Building inspectors aren't volunteers and they seem to do a
             | good enough job.
        
           | observationist wrote:
           | What's the value of academic publishing over the arxiv model
           | of freely publishing, free access, and a global, vigorous
           | discussion across a wide range of platforms, with experts,
           | researchers, amateurs, institutions, and the peanut gallery
           | all having the opportunity to participate?
           | 
           | What possible value does a journal like Nature, for example,
           | bring to the table by claiming a paper for themselves and
           | charging people for it, given the alternative?
           | 
           | I don't see any value there. Maintaining an exclusive clique
           | by using artificial scarcity while coasting on the dregs of
           | reputation remaining to a once prestigious institution is
           | what a lot of these journals are doing.
           | 
           | The world has changed. There's no need for that sort of pay
           | to play gatekeeping, and in fact, the model does tremendous
           | damage to academic and intellectual integrity. It allows
           | people to get away with fraud and it makes the institutions
           | motivated to hide and cover it up so as to not damage their
           | own reputations by admitting anything slipped by them.
           | 
           | If you contrast the damage done by journals, with regards to
           | suppressed research, gatekept access, money taken from
           | researchers and readers alike, against the value they might
           | plausibly provide, the answer is clear.
           | 
           | They're not needed anymore. The AI era, since 2017, has
           | thoroughly demonstrated that journals are materially
           | incapable of keeping up, that they're unable to meaningfully
           | contribute to the field, and that their curation or other
           | involvement has no effective practical value. The same is
           | true for other fields, but everyone involved wants to keep
           | their piece of the grift going as long as possible.
           | 
           | We don't need them, anymore. I suspect we never did.
        
             | jltsiren wrote:
             | The value is the ability to do science as a career without
             | being independently wealthy.
             | 
             | Politicians, administrators, donors, and taxpayers don't
             | want scientists deciding on their own how to spend the
             | money. They want control over what gets funded. They want
             | funding decisions with justifications they can understand.
             | But they don't understand the science itself, so they need
             | "objective" metrics to support the decisions. And because
             | those metrics matter, people will inevitably game them.
        
         | ph4rsikal wrote:
         | My observation is that research, especially in AI has left
         | universities, which are now focusing their research to a lesser
         | degree on STEM. It appears research is now done by companies
         | like Meta, OpenAI, Anthropic, Tencent, Alibaba, among many
         | others.
        
           | bonoboTP wrote:
           | Universities (outside a few) just have much weaker PR
           | machines so you never hear what they do. Also their work is
           | not user facing products so regular people, even tech power
           | users won't see them.
        
             | 0x3f wrote:
             | Not sure about that. How would a university test scaling
             | hypotheses in AI, for example? The level of funding
             | required is just not there, as far as I know.
        
               | rsfern wrote:
               | This issue of accessibility is widely acknowledged in the
               | academic literature, but it doesn't mean that only large
               | companies are doing good research.
               | 
               | Personally I think this resource mismatch can help drive
               | creative choice of research problems that don't require
               | massive resources. To misquote Feynman, there's plenty of
               | room at the bottom
        
               | oscaracso wrote:
               | Universities are also not suited to test which race car
               | is the fastest, but that does not obviate the need for
               | academic research in mechanical engineering.
        
               | 0x3f wrote:
               | Perhaps but the fastest race car is not possibly
               | marshalling in the end of human involvement in science,
               | so you might consider these of considerably different
               | levels of meriting the funding.
        
               | oscaracso wrote:
               | >marshalling in the end of human involvement in science
               | 
               | Good riddance! But not relevant in the least.
        
               | 0x3f wrote:
               | Impact size is not relevant to funding allocation?
        
               | oscaracso wrote:
               | Your attempts to smuggle your conclusions into the
               | conversation are becoming tiresome. Profiling a private
               | company's computer program is not impactful research. The
               | best-fit parameters AI people call scaling exponents are
               | not properties like the proton lifetime or electron
               | electric dipole moment. Rest assured, there remain
               | scientists at universities producing important work on
               | machine learning.
        
               | bonoboTP wrote:
               | There are a million other research things to do besides
               | running huge pretraining runs and hyperparam grid search
               | on giant clusters. To see what, you can start with
               | checking out the best paper and similar awards at
               | neurips, cvpr, iccv, iclr, icml etc.
        
             | tzs wrote:
             | I came across a good example of that a few years ago.
             | Caltech had a page on their site listing Caltech startups.
             | 
             | There were quite a few of them--by number of starts per
             | year per person Caltech was actually generating startups at
             | a higher rate than Stanford. But almost none of those
             | Caltech startups were doing anything that would bring them
             | to the public's attention, or even to the average HN
             | reader's attention.
             | 
             | For example one I remember was a company developing
             | improved ion thrusters for spacecraft. Another was doing
             | something to automate processing samples in medical labs.
             | 
             | Also almost none of them were the "undergraduates drop out
             | to form a company" startup we often hear about, where the
             | founders aren't actually using much that they actually
             | learned at the school, with the school functioning more as
             | a place that brought the founders together.
             | 
             | The Caltech startups were most often formed by professors
             | and grad students, and sometimes undergraduates that were
             | on their research team, and were formed to commercialize
             | their research.
             | 
             | My guess is that this is how it is at a lot of
             | universities.
        
               | Fomite wrote:
               | Every university I've worked in has been dominated by
               | this paradigm, has an office set up to support it, and a
               | bunch of policies around what it means for your doctoral
               | supervisor to also be your employer, etc.
        
           | PaulHoule wrote:
           | That's a specific field at a very specific time. In general
           | there is a difference between research and development,
           | you're going to expect the early work to be done in academia
           | but the work to turn that into a product is done by
           | commercial organizations.
           | 
           | You get ahead as an academic computer scientist, for
           | instance, by writing papers not by writing software. Now
           | there really are brilliant software developers in academic CS
           | but most researchers write something that kinda works and
           | give a conference talk about it -- and that's OK because the
           | work to make something you can give a talk about is probably
           | 20% of the work it would take to make something you can put
           | in front of customers.
           | 
           | Because of that there are certain things academic researchers
           | really can't do.
           | 
           | As I see it my experience in getting a PhD and my experience
           | in startups is essentially the same: "how do you make
           | doing things nobody has ever done before routine?" Talk to
           | people in either culture and you see the PhD students are
           | thinking about either working in academia or a very short
           | list of big prestigious companies and people at startups are
           | sure the PhDs are too pedantic about everything.
           | 
           | It took me a long time of looking at other people's side
           | projects that are usually "I want to learn programming
           | language X", "I want to rewrite something from _Software
           | Tools_ in Rust" to realize just how foreign that kind of
           | creative thinking is to people -- I've seen it for a long
           | time that a side project is not worth doing unless: (1) I
           | really need the product or (2) I can show people something
           | they've never seen before or better yet both. These sound
           | different, but if something doesn't satisfy (2) you can
           | usually satisfy (1) off the shelf. It just amazes me how many
           | type (2) things stay novel even after 20 years of waiting.
        
         | stared wrote:
         | > arXiv fulfills its function better the less power it has as
         | an institution
         | 
         | It is an interesting instance of the rule of least power,
         | https://en.wikipedia.org/wiki/Rule_of_least_power.
        
           | fidotron wrote:
           | The irony of the TBL quotes there being the entire problem
           | with the semantic web is the ontological tarpit that results
           | due to the excessive expressive power of a general triple
           | store.
        
             | PaulHoule wrote:
             | Well, I'd argue that many things in the semweb are not
             | expressive enough and lead to the misunderstandings we
             | have.
             | 
             | People think, for instance, that RDFS and OWL are meant to
             | SHACL people into bad and over-engineered ontologies. The
             | problem is these standards _add_ facts and don't subtract
             | facts. At risk of sounding like ChatGPT: it's a data
             | transformation system not a validation system.
             | 
             | That is, you're supposed to use RDFS to say something like
             | 
             |     ?s :myTermForLength ?o -> ?s :yourTermForLength ?o .
             | 
             | The point of the namespace system is not to harass you, it
             | is to be able to suck in data from unlimited sources and
             | transform it. Trouble is it can't do the simple math
             | required to do that for real, like
             | 
             |     ?s :lengthInFeet ?o -> ?s :lengthInInches 12*?o .
             | 
             | Because if you were trying OWL-style reasoning over
             | arithmetic you would run into Kurt Gödel kinds of problems.
             | Meanwhile you can't subtract facts that fail validation,
             | you can't subtract facts that you just don't need in the
             | next round of processing. It would have made sense to
             | promote SHACL first instead of OWL because garbage-in-
             | garbage out, you are not going to reason successfully
             | unless you have clean data... but what the hell do I know,
             | I'm just an applications programmer who models business
             | processes enough to automate them.
             | 
             | Similarly the problem of ordered collections has never been
             | dealt with properly in that world. PostgreSQL, N1QL and
             | other post-relational and document DB languages can write
             | queries involving ordered collections easily. I can write
             | rather unobvious queries by hand to handle a lot of cases
             | (wrote a paper about it) but I can't cover all the cases
             | and I know back in the day I could write SPARQL queries much
             | better than the average RDF postdoc or professor.
             | 
             | As for underengineering, Dublin Core came out when I worked
             | at a research library and it just doesn't come close in
             | capability to MARC from 1970. Larry Masinter over at Adobe
             | had to hack the standard to handle ordered collections
             | because... the authors of a paper sure as hell care what
             | order you write their names in. And it is all like that:
             | RDF standards neglect basic requirements that they need to
             | be useful and then all the complex/complicated stuff really
             | stands out. If you could get the basics done maybe people
             | would use them but they don't.
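
The distinction drawn in the comment above (inference rules that only add facts, versus arithmetic and validation steps that need ordinary code) can be sketched in a few lines of Python. This is an illustrative toy and not an RDF library: the triple representation, the function names, and the sample data are all assumptions made for the example, and the two rules mirror the ones written out in the comment.

```python
# Toy triple store: a set of (subject, predicate, object) tuples.
# All names and data here are hypothetical, for illustration only.

def apply_property_mapping(facts, source, target):
    """RDFS-style rule: ?s source ?o -> ?s target ?o. It only ever adds facts."""
    derived = {(s, target, o) for (s, p, o) in facts if p == source}
    return facts | derived

def apply_scaled_mapping(facts, source, target, factor):
    """Unit-conversion rule (e.g. feet -> inches), done in ordinary code
    because it needs arithmetic on the object value."""
    derived = {(s, target, o * factor) for (s, p, o) in facts if p == source}
    return facts | derived

def drop_invalid(facts, predicate, is_valid):
    """Validation step: removes facts, which an add-only rule cannot do."""
    return {(s, p, o) for (s, p, o) in facts if p != predicate or is_valid(o)}

facts = {
    ("plank", ":myTermForLength", 3),
    ("beam", ":lengthInFeet", 2),
    ("rod", ":lengthInFeet", -1),  # bad data: negative length
}

facts = apply_property_mapping(facts, ":myTermForLength", ":yourTermForLength")
facts = drop_invalid(facts, ":lengthInFeet", lambda value: value > 0)
facts = apply_scaled_mapping(facts, ":lengthInFeet", ":lengthInInches", 12)
```

Running the validation step before the conversion keeps the bad `rod` measurement from being propagated into the derived inches facts, which is the garbage-in, garbage-out point made in the comment.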
        
         | light_hue_1 wrote:
         | > Unfortunately, over the years, arXiv has become something
         | like a "venue" in its own right, particularly in ML, with some
         | decently cited papers never formally published and "preprints"
         | being cited left and right. Consider the impression you get
         | when seeing a reference to an arXiv preprint vs. a link to an
         | author's institutional website.
         | 
         | This just isn't true. arXiv is not a venue. There's no place
         | that gives you credit for arXiv papers. No one cares if you
         | cite an arXiv paper or some random website. The vast vast
         | majority of papers that have any kind of attention or citations
         | are published in another venue.
        
           | contubernio wrote:
           | A Fields medal was awarded based mainly on this paper never
           | published elsewhere: https://arxiv.org/abs/math/0211159
        
             | auggierose wrote:
             | I think there is a misunderstanding here. Does arXiv count
             | as a publication? Yes, pretty much anything that gives you
             | a DOI does, for example Zenodo. Does it function as a
             | reputable anything? No.
             | 
             | The paper you link to counts as a publication, but its
             | reputation stands on its own, it has nothing to do with
             | arXiv as a venue. Ideally, that's how it is for all papers,
             | but it isn't, just by publishing in certain venues your
             | paper automatically gets a certain amount of reputation
             | depending on the venue.
        
               | fc417fc802 wrote:
               | > Ideally, that's how it is for all papers, but it isn't
               | 
               | We require a method of filtering such that a given
               | researcher doesn't have to personally vet _in
               | excruciating detail_ every paper he comes across because
               | there simply isn't enough time in the day for that.
               | 
               | Ideally such a system would individually for each paper
               | provide a multi-dimensional score that was reputable. How
               | can those be calculated in a manner such that they're
               | reputable? Who knows; that exercise is left for the
               | reader.
               | 
               | In practice "well it got published in Nature" makes for a
               | pretty decent spam filter followed by metrics such as how
               | many times it's been cited since publication, checking
               | that the people citing it are independent authors who
               | actually built directly on top of the work, and checking
               | how many of such citing authors are from a different
               | field.
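
As a rough sketch of the citation-based signals listed in the comment above, the following Python computes three counts for one paper: total citations, citations from papers sharing no author with it, and how many of those independent citations come from another field. The `Paper` record and the sample data are hypothetical, and the sketch makes no attempt to judge whether a citing paper "actually built directly on top of the work", which needs information a bare citation list does not carry.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Paper:
    # Hypothetical minimal record; real bibliographic metadata is messier.
    title: str
    authors: frozenset
    field: str

def citation_signals(paper, citing_papers):
    """Count citations, independent citations, and cross-field citations."""
    # Independent: the citing paper shares no author with the cited paper.
    independent = [c for c in citing_papers if not (c.authors & paper.authors)]
    # Cross-field: independent citations coming from a different field.
    cross_field = [c for c in independent if c.field != paper.field]
    return {
        "total": len(citing_papers),
        "independent": len(independent),
        "cross_field": len(cross_field),
    }

target = Paper("Original result", frozenset({"Ada", "Bo"}), "ml")
citing = [
    Paper("Follow-up by same group", frozenset({"Ada", "Cy"}), "ml"),
    Paper("Independent extension", frozenset({"Di"}), "ml"),
    Paper("Application in chemistry", frozenset({"Eve", "Fay"}), "chemistry"),
]
signals = citation_signals(target, citing)
```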
        
               | mitthrowaway2 wrote:
               | Can't we do better than that?
               | 
               | PageRank was a decent solution for websites. Can't we
               | treat citations as a graph, calculate per-author and per-
               | paper trustworthiness scores, update when a paper gets
               | retracted, and mix in a dash of HN-style community
               | upvotes/downvotes and openly-viewable commentary and Q&A
               | by a community of experts and nonexperts alike?
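
The graph-ranking part of the proposal above can be sketched as a plain power-iteration PageRank over a citation graph, with retractions handled by removing the retracted paper and every citation pointing at it before re-ranking. This is a minimal sketch under assumptions: the toy graph, the function names, and the damping value of 0.85 are illustrative choices, and per-author scores, community votes, and commentary are not modelled.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """citations maps each paper to the list of papers it cites."""
    papers = set(citations) | {c for cited in citations.values() for c in cited}
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Every paper starts each round with the uniform "teleport" share.
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for paper in papers:
            cited = citations.get(paper, [])
            if cited:
                # Split this paper's rank evenly among the papers it cites.
                share = damping * rank[paper] / len(cited)
                for c in cited:
                    new_rank[c] += share
            else:
                # Dangling paper (cites nothing): spread its rank evenly.
                share = damping * rank[paper] / n
                for p in papers:
                    new_rank[p] += share
        rank = new_rank
    return rank

def without_retracted(citations, retracted):
    """Drop retracted papers and every citation pointing at them."""
    return {
        paper: [c for c in cited if c not in retracted]
        for paper, cited in citations.items()
        if paper not in retracted
    }

# Toy citation graph (hypothetical): A and B cite C, C and E cite D.
citations = {"A": ["C"], "B": ["C"], "C": ["D"], "D": [], "E": ["D"]}
scores = pagerank(citations)
scores_after = pagerank(without_retracted(citations, {"C"}))
```

Note that a citation counts as an endorsement here regardless of why it was made, so a paper cited only to be refuted would still gain rank.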
        
               | auggierose wrote:
               | You know that is what PageRank was originally for, right?
        
               | mitthrowaway2 wrote:
               | Sure. In that case I guess I'm just waiting for a couple
               | of college kids in a garage to start a website that
               | actually uses it for its intended purpose, so that we can
               | finally deprecate PrestigiousPrivateJournalRank.
        
               | fc417fc802 wrote:
               | Of course we could! My tongue in cheek "exercise is left
               | for the reader" comment was meant to convey that it's
               | deceptively simple.
               | 
               | Just one example off the top of my head. How do you
               | handle negative citations? For example a reputable author
               | citing a known incorrect paper to refute it. You need
               | more metadata than we currently have available.
               | 
               | tl;dr just draw the rest of the fucking owl.
               | 
               | Upvotes, downvotes, and commentary? That's _extremely_
               | complicated. Long term data persistence? Moderation? Real
               | names? Verification of lab affiliations? Who sets the
               | rules? How do you cope with jurisdictional boundaries and
               | related censorship requirements? The scientific
               | literature is fundamentally an open and above all
               | international collaboration. Any sort of closed,
               | centralized, or proprietary implementation is likely to
               | be a nonstarter.
               | 
               | Thus if your goal is a universal system then I'm fairly
               | certain you need to solve the decentralized social
               | networking problem as a more or less hard prerequisite to
               | solving the decentralized scientific literature review
               | problem. This is because you need to solve all the same
               | problems but now with a much higher standard for data
               | retention and replication.
               | 
               | Very topically I assume you'd need a federated protocol.
               | It would need to be formally standardized. It would need
               | a good story for data replication and archival which
               | pretty much rules out ActivityPub and ATProto as they
               | currently stand so you're back to the drawing board.
               | 
               | A nontrivial part of the above likely involves also
               | solving the decentralized petname system problem that GNS
               | attempts to address.
               | 
               | I think a fully generalized scoring or ranking system is
               | exceedingly unlikely to be a realistic undertaking.
               | There's no problem with isolated private venues (ie
               | journals) we just need to rethink how they work. Services
               | such as arxiv provide a DOI so there's nothing stopping
               | "journals" that are actually nothing more than
               | lightweight review platforms that don't actually host any
               | papers themselves from being built.
        
               | auggierose wrote:
               | > Upvotes, downvotes, and commentary? That's extremely
               | complicated.
               | 
               | No, it is not. Don't throw the baby out with the bath
               | water. Zenodo is centralized, and that is fine. A system
               | hosted by CERN would be universal enough for most
               | purposes.
               | 
               | The truth is, most papers cannot stand on their own, they
               | need a reputable venue. While it is difficult to get into
               | Nature, it is much more difficult to actually contribute
               | something substantial to science. That's why we don't
               | have a system like that.
        
               | fc417fc802 wrote:
               | I think you've misunderstood me. Did you read my final
               | paragraph? I was agreeing with what you wrote there -
               | that simply rethinking how centralized journals operate
               | could accomplish the majority of the goal while
               | sidestepping most of the complexity.
               | 
               | That said, I disagree that papers require a centralized
               | venue in any fundamental sense. They _currently_ need
               | such a venue because we don't have a better process for
               | vetting and filtering them at scale. The issue is that
               | decentralizing such a process in an acceptable manner is
               | a monstrously complicated prospect.
        
               | auggierose wrote:
               | > We require a method of filtering such that a given
               | researcher doesn't have to personally vet in excruciating
               | detail every paper he comes across because there simply
               | isn't enough time in the day for that.
               | 
               | We do require such a method. Isn't that what AI is for?
               | Strictly working as a filter. You still need to
               | personally vet in excruciating detail every paper you
               | rely on for your work.
        
               | fc417fc802 wrote:
               | Maybe. I think that's still experimental and far too
               | resource intensive to do on an individual basis. However
               | an intensive LLM review performed by a centralized
               | service once per paper as a sort of independent
               | literature watchdog would likely be of value. I haven't
               | heard of such a thing yet though.
        
             | light_hue_1 wrote:
             | It was not awarded because that paper is on arxiv. That
             | paper could have been printed and sent out by mail. Or
             | posted on 4chan. etc. It just so happens that it was on
             | arxiv, which made no difference to anything.
        
         | queuebert wrote:
         | > Unfortunately, over the years, arXiv has become something
         | like a "venue" in its own right, ...
         | 
         | In my experience as a publishing scientist, this is partly
         | because publishing with "reputable" journals is an increasingly
         | onerous process, with exorbitant fees, enshittified UIs, and
         | useless reviews. The alternative is to upload to arXiv and move
         | on with your life.
        
           | groundzeros2015 wrote:
           | That's true. But that's separate from the use in ML and
           | blockchain circles as a form of marketing - using academic
           | appearances.
        
             | jjk166 wrote:
             | That sounds more like an issue of certain fields having
             | crappy standards because the people in those fields benefit
             | from crappy standards than an issue with the site they
             | happen to host papers on.
        
               | groundzeros2015 wrote:
               | I don't buy "some fields are just more honorable".
               | Everyone uses publishing for personal gain.
               | 
               | But yes it's a people problem, not an arxiv problem.
        
             | StableAlkyne wrote:
             | Every field and every publisher has this issue though.
             | 
             | I've read papers in the chemical literature that were
             | clearly thinly veiled case studies for whatever instrument
             | or software the authors were selling. Hell, I've read
             | papers that had interesting results, only to dig into the
             | math and find something fundamentally wrong. The worst was
             | an incorrect CFD equation that I traced through a telephone
             | game of 4 papers only to find something to the effect of
             | "We speculate adding $term may improve accuracy, but we
             | have not extensively tested this"
             | 
             | Just because something passed peer review does not make it
             | a good paper. It just means somebody* looked at it and
             | didn't find any obvious problems.
             | 
             | If you are engaged in research, or in a position where
             | you're using the scientific literature, it is vital that
             | you read every paper with a critical lens. Contrary to
             | popular belief, the literature isn't a stone tablet sent
             | from God. It's messy and filled with contradictory ideas.
             | 
             | *Usually it's actually one of their grad students
        
               | groundzeros2015 wrote:
               | I completely agree. Sophisticated marketing campaigns
               | include everything from academic literature to
               | bikini-clad women.
        
         | Aurornis wrote:
         | > and with just enough moderation to not devolve into spam and
         | chaos
         | 
         | arXiv has become a target for grifters in other domains like
         | health and supplements. I've seen several small scale health
         | influencers who ChatGPT some "papers" and then upload them to
         | arXiv, then cite arXiv as proof of their "published research".
         | It's not fooling anyone who knows how research works but it's
         | very convincing to an average person who thinks that
         | they're doing the right thing when they follow sources that
         | have done academic research.
         | 
         | I've been surprised at how bad and obviously grifty some of the
         | documents I've seen on arXiv have become lately. Is there any
         | moderation, or is it a free for all as long as you can get an
         | invite?
        
         | aimarketintel wrote:
         | This is great news for anyone building tools on top of arXiv
         | data. The API (export.arxiv.org/api/) is one of the best free
         | academic data sources -- structured Atom feed with full
         | abstracts, authors, categories, and publication dates.
         | 
         | I've been using it as one of 9 data sources in a market
         | research tool -- arXiv papers are a strong leading indicator of
         | where an industry is heading. Academic research today often
         | becomes commercial products in 2-3 years.
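         | 
         | For anyone curious what "structured Atom feed" means in
         | practice, here is a minimal sketch, not a definitive client. It
         | assumes the public query endpoint under export.arxiv.org/api/
         | and standard Atom element names; the helper names
         | (build_query_url, parse_feed) and the sample feed are made up
         | for illustration, and the sample is parsed offline so nothing
         | here touches the network.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Atom namespace used by the feed; elements are looked up with this prefix.
ATOM = "{http://www.w3.org/2005/Atom}"
API_URL = "http://export.arxiv.org/api/query"


def build_query_url(search_query, start=0, max_results=5):
    """Build a query URL (hypothetical helper; only constructs the string)."""
    params = {
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def parse_feed(xml_text):
    """Extract title, abstract, date, authors and categories per entry."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall(ATOM + "entry"):
        papers.append({
            # Collapse the line breaks and indentation inside text fields.
            "title": " ".join(entry.findtext(ATOM + "title", "").split()),
            "summary": " ".join(entry.findtext(ATOM + "summary", "").split()),
            "published": entry.findtext(ATOM + "published", ""),
            "authors": [a.findtext(ATOM + "name", "")
                        for a in entry.findall(ATOM + "author")],
            "categories": [c.get("term")
                           for c in entry.findall(ATOM + "category")],
        })
    return papers


# Made-up feed in the general shape of an Atom response (illustration only).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <published>2026-01-01T00:00:00Z</published>
    <title>An example title
      (hypothetical)</title>
    <summary>A placeholder abstract.</summary>
    <author><name>A. Author</name></author>
    <author><name>B. Author</name></author>
    <category term="cs.LG"/>
    <category term="stat.ML"/>
  </entry>
</feed>"""

if __name__ == "__main__":
    print(build_query_url("cat:cs.LG", max_results=2))
    for paper in parse_feed(SAMPLE_FEED):
        print(paper["published"], paper["title"], paper["authors"])
```

         | To use it against live data you would fetch the URL from
         | build_query_url (for example with urllib.request.urlopen) and
         | pass the response text to parse_feed.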
        
         | PaulHoule wrote:
         | Review papers are interesting.
         | 
         | Bibliometrics reveal that they are highly cited. Internal data
         | we had at arXiv 20 years ago show they are highly read. Reading
         | review papers is a big part of the way you go from a civilian
         | to an expert with a PhD.
         | 
         | On the other hand, they fall through the cracks of the normal
         | methods of academic evaluation.
         | 
         | They create a lot of value for people but they are not likely
         | to advance your career that much as an academic, certainly not
         | in proportion to the value they create, or at least the value
         | they used to create.
         | 
         | One of the most fun things I did on the way to a PhD was
         | writing a literature review on giant magnetoresistance for the
         | experimentalist on my thesis committee. I went from knowing
         | hardly anything about the topic to writing a summary that
         | taught him a lot he didn't know. Given any random topic in any
         | field you could task me with writing a review paper and I could
         | go out and do a literature search and write up a summary. An
         | expert would probably get some details right that I'd get
         | wrong, might have some insights I'd miss, but it's actually a
         | great job for a beginner, it will teach you the field much more
         | effectively than reading a review paper!
         | 
         | How you regulate review papers is pretty tricky. If it is
         | original research the criterion of "is it original research" is
         | an important limit. There might already be 25 review papers on
         | a topic, but maybe I think they all suck (they might) and I can
         | write the 26th and explain it to people the way I wish it was
         | explained to me.
         | 
         | Now you might say in the arXiv age there was not a limit on
         | pages, but LLMs really do problematize things because they are
         | pretty good at summarization. Send one off on the mission to
         | write a review paper and in some ways they will do better than
         | I do, in other ways will do worse. Plenty of people have no
         | taste or sense of quality and they are going to miss the latter
         | -- hypothetically people could do better as a centaur but I
         | think usually they don't because of that.
         | 
         | One could make the case that LLMs make review papers obsolete
         | since you can always ask one to write a review for you or just
         | have conversations about the literature with them. I know I
         | could have spent a very long time studying the literature on
         | Heart Rate Variability and eventually made up my mind about
         | which of the 20 or so metrics I want to build into my
         | application and I did look at some review papers and can
         | highlight sentences that support my decisions but I made those
         | decisions based on a few weekends of experiments and talking to
         | LLMs. The funny thing is that if you went to a conference and
         | met the guy who wrote the review paper and gave them the hard
         | question of "I can only display one on my consumer-facing HRV
         | app, which one do I show?" they would give you that clear
         | answer that isn't in the review paper and maybe the odds are
         | 70-80% that it will be my answer.
        
           | jballanc wrote:
           | I exited academia for industry 15 years ago, and since then I
           | haven't had nearly as much time to read review papers as I
           | would like. For that reason, my view may be a bit outdated,
           | but one thing I remember finding incredibly useful about
           | review papers is that they provided a venue for speculation.
           | 
           | In the typical "experimental report" sort of paper, the focus
           | is typically narrowed to a knife's edge around the hypothesis,
           | the methods, the results, and analysis. Yes, there is the
           | "Introduction" and a "Discussion", but increasingly I saw
           | "Introductions" become a venue to do citation bartering (I'll
           | cite your paper in the intro to my next paper if you cite
           | that paper in the intro to your next paper) and "Discussion"
           | turn into a place to float your next grant proposal before
           | formal scoring.
           | 
           | Review papers, on the other hand, were more open to
           | speculation. I remember reading a number that were framed as
           | "here's what has been reported, here's what that likely
           | means...and here's where I think the field could push forward
           | in meaningful ways". Since the veracity of a review is
           | generally judged on how well it covers and summarizes what's
           | already been reported, and since no one is getting their next
           | grant from a review, there's more space for the author to
           | bring in their own thoughts and opinions.
           | 
           | I agree that LLMs have largely removed the need for review
           | papers as a reference for the current state of a field...but
           | I'll miss the forward-looking speculation.
           | 
           | Science is staring down the barrel of a looming crisis that
           | looks like an echo chamber of epic proportions, and the only
           | way out is to figure out how to motivate reporting negative
           | results and sharing speculative outsider thinking.
        
             | PaulHoule wrote:
             | My feelings about that outsider thing are pretty mixed.
             | 
             | On one hand I'm the person who implemented the endorsement
             | system for arXiv. I also got a PhD in physics, did a postdoc
             | in physics, then left the field. I can't say that I was
             | mistreated, but I saw one of the stars of the field today
             | crying every night when he was a postdoc because he was so
             | dedicated to his work and the job market was so brutal --
             | so I can say it really hurts when I see something that I
             | think belittles that.
             | 
             | On the other hand I am very much an interested outsider
             | when it comes to biosignals, space ISRU, climate change,
             | synthetic biology and all sorts of things. With my startup
             | and hackathon experience it is routine for me to go look at
             | a lot of literature in a new field and cook it down and
             | realize things are a lot simpler than they look and build a
             | demo that knocks the socks off the postdocs because...
             | that's what I do.
             | 
             | But Riemann Hypothesis, Collatz, dropping names of anyone
             | who wrote a popular book, I don't do that. What drives me
             | nuts about crackpots is that they are all interested in the
             | same things whereas real scientists are interested in
             | something different. [1] It was a big part of our thinking
             | about arXiv -- crackpot submissions were a tiny fraction of
             | submissions to arXiv but they would have been half the
             | submissions to certain fields like quantum gravity.
             | 
             | I've sat around campfires where hippies were passing a
             | spliff around and talking about that kind of stuff and was
             | really amused recently when we found out that Epstein did
             | the thing with professors who would have known better -- I
             | mean, I will use my seduction toolbox to get people like
             | that to say more than they should but not to have the same
             | conversation I could have at a music festival.
             | 
             | [1] e.g. I think Tolstoy got it backwards!
        
               | aleph_minus_one wrote:
               | > crackpot submissions were a tiny fraction of submissions
               | to arXiv but they would have been half the submissions to
               | certain fields like quantum gravity
               | 
               | Just some very outsider thought:
               | 
               | Could it be that this problem is rather self-inflicted by
               | researchers and their marketing?
               | 
               | Physicists market all the time that resolving these
               | questions about quantum gravity will give the answers to
               | the deepest questions that plagued philosophers over
               | millennia. Well, such marketing attracts crackpots who
               | do believe that they have something to tell about such
               | topics.
               | 
               | Relatedly, to improve their chances of getting research
               | funding, a lot of researchers do outreach to the
               | general public to show the importance of the questions
               | that they work on. Of course this means that people from
               | the general public who now get interested in such
               | questions will make their own attempt to make a
               | contribution because - well, this researcher just told me
               | how important it is to think about such questions. Of
               | course such a person from the general public typically
               | does not have the deep scientific knowledge such that
               | their contribution meets the high scientific standards.
        
         | abdullahkhalids wrote:
         | > Unfortunately, over the years, arXiv has become something
         | like a "venue" in its own right, particularly in ML, with some
         | decently cited papers never formally published and "preprints"
         | being cited left and right.
         | 
         | This has been a common practice in physics, especially the more
         | theoretical branches, since the inception of arXiv. Senior
         | researchers write a paper draft, and then send copies to some
         | of their peers, get and incorporate feedback, and just submit
         | to arxiv.
        
           | godelski wrote:
           | And this is really how it should be. Honestly the only thing
           | I want arxiv to do is become more like open review. Allow
           | comments by peers and some better linking to data and project
           | pages.
           | 
           | It works for physics because physicists are very rigorous. So
           | papers don't change very much. It also works for ML because
           | everyone is moving very fast that it's closer to doing open
           | research. Sloppier, but as long as the readers are other
           | experts then it's generally fine.
           | 
           | I think research should really just be open. It helps
           | everyone. The AI slop and mass publishing is exploiting our
           | laziness; evaluating people on quantity rather than quality.
           | I'm not sure why people are so resistant to making this
           | change. Yes, it's harder, but it has a lot of benefits. And
           | at the end of the day it doesn't matter if a paper is
           | generated if it's actually a quality paper (not in just how
           | it reads, but the actual research). Slop is slop and we
           | shouldn't want slop regardless. But if we evaluate on quality
           | and everything is open it becomes much easier to figure out
           | who is producing slop, collusion rings, plagiarist rings, and
           | all that. A little extra work for a lot of benefits. But we
           | seem to be willing to put in a lot of work to avoid doing
           | more work
        
             | abdullahkhalids wrote:
             | I don't actually agree that this is how it should or can work
             | for everyone. Senior researchers produce good quality
             | research, and they have a network of high quality peers
             | built over decades. Both those are necessary for them to
             | reach out and ask for feedback, and get genuine and high
             | quality feedback.
             | 
             | Junior researchers don't have these typically. They also
             | benefit more from anonymous feedback, which enables the
             | reviewers to bluntly identify wrong or close to wrong
             | results. So I think open journals should continue to exist.
             | They fill an essential role in the scientific ecosystem.
        
               | godelski wrote:
               | Mostly I'm fine with journals and conferences but I think
               | it's the prestige that has fucked everything over.
               | 
               | I want reviews of my papers! But I want reviews by people
               | who care. I don't want reviews by people who don't want
               | to review. I don't want reviews by people who think it's
               | their job to reject or find flaws in the work. I want
               | reviews by people who care. I want reviews by people who
               | want to make my work better. I want reviews by people who
               | understand all works are flawed and we can't tackle every
               | one in every paper (the problem isn't solved, so there's
               | always more!).
               | 
               | So low bars. Forget the prestige, citation count,
               | novelty, and all the bullshit and just focus on the
               | actual work and that the act of publishing is about
               | communicating. Publishing is the main difference between
               | private and public labs. Private labs do fine research,
               | without all the formal review. It's just that nobody
               | learns about it. They don't give back to the community.
               | 
               | So my ideal system still has reviewers, journals, and
               | conferences but I think we'd get along just fine without
               | them. I believe that if we can't recognize that then we
               | can't use these other tools to make things better.
               | 
               | They aren't fundamental tools needed to make the process
               | work, they're tools that _can_ make the process work
               | better. But I'm not convinced they're doing a good job
               | of that right now.
        
             | lokar wrote:
             | You could imagine separating the "publishing" part, which
             | really should just be open with minimal anti-spam etc, from
             | the "this was reviewed by a trusted group of people so you
             | should give it more consideration" part. You could do the
             | second without it being attached to the publishing.
        
               | godelski wrote:
               | I think your phrasing was good. A lot of people conflate
               | a work being published with it being peer reviewed, and
               | think that "peer reviewed" means "correct".
               | 
               | I think when you think about publishing as what it
               | actually is, researchers communicating to researchers,
               | what I said makes much more sense. I do think formal
               | review does help reduce slop but I think anyone who has
               | published anything is also very aware of how noisy the
               | system is and how good works get rejected or delayed
               | because they aren't "novel" enough.
               | 
               | Honestly, my ideal system is journals with low bars. We
               | forget this prestige bullshit and silliness of novelty
               | (often it's novel to niche experts but not to others) and
               | basically check if it looks like due diligence was done,
               | there's not things obviously wrong, no obvious
               | plagiarism, and then maybe a little back and forth to
               | help communicate. But I think we've gotten too lost in
               | this idea of needing to publish fast and that it has to be
               | important. Important to who? Tons of stuff is only
               | considered important later, we've got a long track record
               | of not being so great at that. But we have a long track
               | record of at least some people working on what we later
               | find out is important.
        
               | nickpsecurity wrote:
               | There's a lot of stuff with basic errors in peer reviewed
               | journals. Things also can get rejected for anything from
               | formatting to politics.
               | 
               | I like Arxiv better. I get the paper, know it's probably
               | not reviewed (like in many journals), and review it if I
               | want to. I used to use Citeseerx, too, to get tons of
               | CompSci papers. Even better, OpenReview might have some
               | good observations.
        
         | fsckboy wrote:
         | > _We've seen the kind of appeasement prose from their
         | statement and FAQ [1] countless times before_
         | 
         | what are you referring to, who is being appeased who shouldn't
         | be? what are you worried about happening?
        
       | asimpleusecase wrote:
       | I wonder if there are plans to licence the content for AI
       | training
        
         | KellyCriterion wrote:
         | I'd guess OAI & co have already copied without asking?
        
           | mkl wrote:
           | No need to ask - the whole point is open access.
           | https://info.arxiv.org/help/bulk_data.html
        
         | mkl wrote:
         | It's been available all along:
         | https://info.arxiv.org/help/bulk_data.html
        
       | shevy-java wrote:
       | "Recently arXiv's growth has accelerated. Since 2022, it has
       | expanded its staff to 27, in large part to deal with a 50%
       | increase in submitted manuscripts."
       | 
       | I am wary of that. IMO the business model is damaged therein. You
       | can say in 2022 we had 27; bankrupt in 2030.
        
       | Aerolfos wrote:
       | And they hired a LinkedIn business idiot to run the new
       | organization - so the aim is for an infinite growth tech startup
       | in terms of governance, despite the technical legal status of
       | non-profit. It shows in the language they use in the
       | announcement, too ("improved financial viability in the long
       | run")
       | 
       | OpenAI shows exactly how well that works and what that kind of
       | governance does to a company and to its support of science and
       | the commons.
       | 
       | TL;DR, it's fucked.
        
       | swiftcoder wrote:
       | > raised concerns about the proposed $300,000 salary for arXiv's
       | new CEO, saying it seemed high
       | 
       | Is a mid-to-high engineering salary outlandish for a CEO of what
       | is likely to be a fairly major non-profit? Even non-profits have
       | to be somewhat competitive when it comes to salary, and the ideal
       | candidate is likely someone who would be balancing this against a
       | tenured position at a major university
        
         | mort96 wrote:
         | Salaries in the US are so bonkers. Everywhere else outside of
         | the US, $300,000 is an outlandishly high salary. To call it "mid
         | to high" is insane.
        
           | HappyPanacea wrote:
           | Yes the obvious play is to move human labor to cheaper
           | countries like France (including CEO of course).
        
             | renewiltord wrote:
             | The reason the French can't build these things is the same
             | reason they shouldn't be allowed to be in charge. It's a
             | preprint PDF host. Just make your own if you can run this
             | one.
        
               | magnio wrote:
               | They do have their own: https://hal.science/
               | 
               | It is actually quite common to come across HAL in
               | subfields of mathematics in my experience.
        
               | bjourne wrote:
               | HAL is decidedly second-tier. Given the option, everyone
               | would pick arXiv over HAL. Hence, HAL hosts lots of stuff
               | that didn't (even) make it to arXiv => lots of subpar
               | dredge.
        
               | Miraltar wrote:
               | > HAL is decidedly second-tier. Given the option,
               | everyone would pick arXiv over HAL.
               | 
               | Can you elaborate on that?
        
               | linhns wrote:
               | I agree that dredge is a huge problem with HAL, but it's
               | getting better. While arXiv is still stuck with an
               | unfriendly UI.
        
               | renewiltord wrote:
               | That's great. People will use whichever one is better.
        
               | swiftcoder wrote:
               | Turns out that "better" for many people means "better
               | moderated", since static hosting is hard to
               | differentiate. And at present Arxiv is winning that one
               | (at the expense of considerably higher running costs due
               | to said moderation)
        
             | 0x3f wrote:
             | The net salary in France might be low but the overall cost
             | of hiring is quite high. Besides, why go to the middle when
             | you can just find even cheaper places, if that's your prime
             | metric?
        
           | swiftcoder wrote:
           | Even in the states, it's more a distortion caused by the big
           | tech centres. A software engineer in Ohio doesn't command
           | that kind of salary, but in San Francisco or Seattle that'll
           | buy you a moderately-senior engineer.
           | 
           | And while academic salaries are generally not great, tenured
           | professors at big universities tend to make a fair bit (plus
           | a lot more vacation time and perks than is normal in the US)
        
             | philipallstar wrote:
             | It's also caused by progressive tax rates. People take
             | harder jobs based on net wage, not gross wage, so gross
             | wage has to compensate.
        
             | justin66 wrote:
             | > A software engineer in Ohio doesn't command that kind of
             | salary, but in San Francisco or Seattle that'll buy you a
             | moderately-senior engineer.
             | 
             | On the other hand, a CEO of a well-known nonprofit might
             | command that kind of salary in Ohio. People often
             | underestimate how much the leaders of nonprofits pay
             | themselves.
        
               | supern0va wrote:
               | I'm not entirely convinced that this is some
               | sort of widespread bad behavior. Many non-profit boards
               | conduct research on salaries and essentially size their
               | organization and pay something akin to a market rate for
               | the given size and scope.
               | 
               | However, even a small percentage of bad actors finding a
               | way to inflate their salaries will, as a side effect,
               | inflate salaries across the board because it influences
               | the process that sets the salaries for the honest
               | organizations.
               | 
               | It's a fun problem.
        
               | justin66 wrote:
               | I suspect abuse is more prevalent at the low end, among
               | nonprofits that don't do much.
               | 
               | I stand by the point of my original post: _People often
               | underestimate how much the leaders of nonprofits pay
               | themselves._ These are figures you can look up and quiz
               | your friends to test the hypothesis, if they're into that
               | sort of thing. For a good time include some nonprofit
               | hospitals.
        
               | supern0va wrote:
               | Outside of manipulating the board, they do not pay
               | themselves, though. The board decides their comp package.
        
               | justin66 wrote:
               | That's fair, but the boards of nonprofits are as
               | corruptible (I'm reluctant to use that word since we're
               | talking about fairly standard practices, not outright
               | crime, but whatever) as those in the corporate world. But
               | I wouldn't want to keep talking about this situation as
               | if it's all theoretical. In contrast with a lot of the
               | corporate world, with nonprofits you can just go and look
               | at what their officers are paid (it's public record) and
               | decide for yourself what you feel about the figures.
        
           | dev_l1x_be wrote:
           | So is the living cost. Insurance, housing, etc. A better
           | comparison is PPP.
        
             | carlosjobim wrote:
             | Living costs are similarly high in many places that have
             | nowhere near the salaries of the US.
             | 
             | It's still the land of opportunities. It's easier to find
             | ways to reduce your living costs than ways to increase your
             | salary.
        
           | 0x3f wrote:
           | Not everywhere. Switzerland exists. Also cost of living is a
           | thing so if anything US/CH just ramp up to match that. The
           | rest of Europe has high CoL but terrible salaries. Asia has
           | bad salaries but low CoL (on average).
        
             | mort96 wrote:
             | According to swissdevjobs.ch[1], the top 10% salary for a
             | senior software developer in Switzerland is 135,000 swiss
             | franc; that's roughly $170,000 per year.
             | 
             | So if this is correct, then even in Switzerland, it seems
             | like $300,000 per year would be an obscenely high salary
             | for a senior developer.
             | 
             | [1]: https://swissdevjobs.ch/salaries/all/all/Senior
        
               | 0x3f wrote:
               | Well first of all it's a CEO position, not an SWE :)
               | 
               | Even if we scope it to SWE, I don't think that's far off
               | the US percentiles.
               | 
               | In London I imagine the top 10% SWE is not even 100k GBP.
               | In Germany even worse.
        
               | mort96 wrote:
               | I responded to the idea that $300,000/year is a "mid-to-
               | high engineering salary". CEO salaries are absurdly high
               | everywhere.
        
               | 0x3f wrote:
               | Oh right, well it depends on CoL doesn't it? You can
               | reframe European salaries as 'obscene' by world standards
               | too. Both the US and Europe have totally broken and
               | unaffordable housing markets, for example, but at least
               | the Bay Area compensates with salary. I would say that
               | relative to costs it's more that other salaries are
               | obscenely low, if anything. People in Europe should be
               | rioting, but unfortunately only the home owners are
               | politically active.
        
               | mort96 wrote:
               | Do cities like San Francisco not have janitors?
               | Waiters? Food delivery drivers? Or do those jobs command
               | a six-figure salary too? If they can live comfortably in
               | the city on a five-figure salary, maybe the argument that
               | "cost of living is so high in SF that you can't live
               | without a $300,000/year salary" is just a little bit
               | overblown?
               | 
               | I can not imagine what one could possibly need $300,000
               | per year for unless an apartment costs like $200,000 per
               | year.
        
               | 0x3f wrote:
               | You get by on a low salary by living with multiple people
               | in the same apartment. Or you live far away and commute.
               | Or both.
               | 
               | Not really a tenable long-term situation for a senior
               | employee with plans to start a family. Family homes of
               | decent size and area are literally millions of dollars.
        
               | mort96 wrote:
               | I guess I don't understand why programmers somehow
               | deserve a better life than other people. Janitors deserve
               | to start families too, don't they?
        
               | 0x3f wrote:
               | It's not about deserving, programmers just have enough
               | market power to be able to choose to go elsewhere.
               | Janitors and other more fungible employees do not.
               | 
               | Besides, I did already say that everyone else was
               | underpaid relative to costs. But that's not unique to the
               | Bay Area. Cost of housing relative to income is terrible
               | in almost all of the major European cities too.
               | 
               | Once cities become wealthy enough to develop a home
               | owning class, they seem to cease being able to provision
               | adequate housing supply in general.
        
               | throw-the-towel wrote:
               | Usually this kind of argument leads to _punishing the
               | programmers_, not lifting up the janitors.
        
               | mort96 wrote:
               | That's kind of two sides of the same coin, isn't it? The
               | cost of living is so high in part _because_ so many have
               | ridiculously high salaries, isn't it?
        
               | swiftcoder wrote:
               | > The cost of living is so high in part because so many
               | have ridiculously high salaries
               | 
               | Bigger problem in the SF area is that a bunch of folks
               | who owned property before the gold rush have ended up
               | real-estate-rich, and formed a voting bloc that actively
               | prevents the construction of new housing (on the basis
               | that it might devalue their accidental real estate
               | investment)
        
               | prepend wrote:
               | It's about how the market values those skillsets, not
               | about what people "deserve."
               | 
               | No one is sitting around and setting salaries based on
               | the intrinsic human dignity of the people working jobs.
        
               | throw-the-towel wrote:
               | > I can not imagine what one could possibly need $300,000
               | per year for unless an apartment costs like $200,000 per
               | year.
               | 
               | Being able to afford unpredictable expenses and not have
               | it bankrupt you. In the US, that would include
               | healthcare. Everywhere in the world, that would be useful
               | if you were laid off.
        
               | mort96 wrote:
               | To build an emergency fund, you just need an income
               | that's a bit higher than your expenses. If you earn
               | $60,000 after tax per year, and spend $50,000 per year,
               | you have a decent $10,000 emergency fund after one year
               | and a massive $100,000 emergency fund after a decade. You
               | don't need $300,000 per year to save.
        
               | swiftcoder wrote:
               | > Do cities like San Francisco not have janitors?
               | Waiters?
               | 
               | When I used to visit the Meta campus in Menlo Park, the
               | QA folk I worked with were commuting 2 hours each way
               | just to be able to afford housing. I've no idea how far
               | away the janitorial staff must have lived to do the same.
        
               | jalla wrote:
               | I worked at Redwood Shores. On a walk across the 101, I
               | discovered where the cleaning staff and food workers
               | lived. In cars, under the bridge or parked in a quiet
               | corner of the street next to industrial or commercial
               | property.
        
               | swiftcoder wrote:
               | > Oh right, well it depends on CoL doesn't it?
               | 
               | To some extent, maybe, but often not. For example, London
               | has a similar cost of living to the Bay Area, and when I
               | was at Meta, experienced folks like Dan Abramov over in
               | London were making about the same as fresh college hires
               | in Menlo Park...
        
               | 0x3f wrote:
               | Yeah I was talking more about the definition of obscene.
               | Like is it obscene to make 300k if housing is so
               | expensive? I say no, and that London salaries are just
               | bad. Although it would be preferable to fix the housing
               | market.
               | 
               | To be fair though, Dan specifically is kind of notorious
               | for messing up his comp negotiation. Did you not see the
               | Twitter pile on at the time?
        
               | swiftcoder wrote:
               | > Dan specifically is kind of notorious for messing up
               | his comp negotiation
               | 
               | Indeed, but having seen the infamous spreadsheet, he
               | didn't have all _that_ much headroom (unless he agreed to
               | move to the US)
        
           | groundzeros2015 wrote:
           | Note that you are seeing an explicit tradeoff of different
           | economic systems.
        
           | ZpJuUuNaQ5 wrote:
           | >Salaries in the US are so bonkers.
           | 
           | Sure, but the cost of living there is significantly higher as
           | well. Anyway, I can hardly even comprehend these kinds of
           | sums, though I am a bit of an outlier, as I earn around
           | $27,700 as an SWE in Europe, which is low even by the
           | standards of companies in my own country.
        
             | nozzlegear wrote:
             | > _Sure, but the cost of living there is significantly
             | higher as well._
             | 
             | The US is huge though, and the cost of living is
             | astronomically lower outside of those big tech hub cities.
             | I live in a tiny town in the midwest with a big house and a
             | big yard that we bought for $89k USD in 2016[+]. I'm able
             | to support myself and my wife comfortably on just my (self-
             | employed) SWE salary.
             | 
             | [+] Real estate inflation index for our area says the house
             | would have cost us around $130-$150k USD in 2026.
        
           | segmondy wrote:
           | Everyone outside the US doesn't deal with USD. Your comment
           | is bonkers. Read up on purchasing power. All locations are
           | not equal.
        
             | jltsiren wrote:
             | The traditional definition of high income starts at 2x the
             | median. Looking at the US as a whole, anything above $125k
             | should be considered high income. But it doesn't feel like
             | that, because median wages are unusually low in the US
             | relative to mean wages. Upper middle class salaries, on the
             | other hand, have grown very high, and they have distorted
             | people's perceptions. Even now, we are debating whether
             | almost 5x the median should be considered high income.
        
             | MattDamonSpace wrote:
             | The US has an enormous per capita GDP for that large a
             | country
        
           | ryukoposting wrote:
           | Silicon Valley is the _only_ place in the United States where
           | $300K is even close to the "middle" of anything.
           | 
           | I just moved to SV a few months ago from the Midwest (and not
           | a particularly cheap part of it). Telling my coworkers who
           | aren't from the US what a house costs in Wisconsin, you'd
           | have thought _I_ was the one who moved from a foreign
           | country.
        
             | swiftcoder wrote:
             | > Silicon Valley is the only place in the United States
             | where $300K is even close to the "middle" of anything.
             | 
             | It does heavily cluster around SV, for sure, but
             | Seattle/NewYork/Boston/Arlington will all get you there,
             | and Chicago/Austin/etc aren't all that far behind at this
             | point
        
             | Supermancho wrote:
             | As a datapoint, I get paid just under 250k/yr and I'm an
             | above average developer in his very late career, at a
             | midwest company. 300k avg for SV is about right.
             | 
             | The local college and medical administrators are the ones
             | that own the mansions in my city. I have a family, house
             | and mortgage plus my large medical expenses (cardiac) I can
             | handle...until I can't.
        
           | snovymgodym wrote:
           | It's frankly not that crazy of a salary for an important
           | executive position.
           | 
           | The city manager of a small city in Texas gets paid around
           | that much and that's taxpayer money.
           | 
           | Now what collegiate football coaches are paid, that's pretty
           | crazy.
        
             | mort96 wrote:
             | I didn't say it's a crazy salary for an important executive
             | position, I said it's wild to call it a "mid-high
             | engineering salary"
        
           | Drupon wrote:
           | Europoors should keep quiet when talking about US tech
           | culture.
        
         | HappyPanacea wrote:
         | arXiv's CEO doesn't need to be a tenured professor equivalent;
         | it is a preprint repository ffs.
        
           | 0x3f wrote:
           | It's a bit more complex than an S3 bucket though because the
           | value comes from the reputation network, which can't really
           | be replicated easily.
           | 
           | Though, saying that, I suppose all the reputation data is
           | kind of public. Apart from emails/accounts.
        
             | groundzeros2015 wrote:
             | > It's a bit more complex than an S3 bucket
             | 
             | It's even less. I would bet if it's not now, for the vast
             | majority of its life it was a machine at someone's desk at
             | Cornell.
        
               | PaulHoule wrote:
               | When I was involved it was an x86 machine in a rack in
               | Rhodes Hall.
               | 
               | I had a copy of the whole thing under my desk though in
               | Olin Library on a Pentium 3 machine from IBM that was
               | built like a piece of military hardware. In April the sun
               | would shine in the windows of my office, the HVAC system
               | was unable to cool my office, and temperatures would soar
               | above 100F and I'd be sitting there in a tank top and
               | drinking a lot of water and sports drinks and visitors
               | would ask me how I could stand it.
        
               | groundzeros2015 wrote:
               | Thanks for confirming. We need to stop marketing for AWS
               | by talking about the ability to use the internet in AWS
               | branded product terms.
        
               | 0x3f wrote:
               | The S3 API/UX/cost model is so seductively simple for
               | static hosting though. I kind of think they deserve their
               | ubiquity. Not on 90% of their products though.
        
               | PaulHoule wrote:
               | It's great for some applications, like to serve up the QR
               | codes for this system
               | 
               | https://mastodon.social/@UP8/116086491667959840
               | 
               | I could even make those cards tradeable like NFTs, use
               | DynamoDB as the ledger, and not worry about the cost at
               | all.
               | 
               | On the other hand if you are talking about something
               | bandwidth heavy forget about AWS. Video hosting with
               | Cloudfront doesn't seem that difficult, even developing a
               | YouTube clone where anybody could upload a video and it
               | gets hosted seems like a moderate sized project. But with
               | the bandwidth meter always running that kind of system
               | could put you into the poorhouse pretty quickly if it
               | caught on. Much of why YouTube doesn't have competition
               | is exactly that: Google's costs are very low _and_ they
               | have an established system of monetization.
               | 
               | I am keeping my photo albums on Behance rather than self-
               | hosting because I lost enough money on a big photo site
               | in AWS that it drove my wife furious and it took me a few
               | years to pay off the debt.
        
               | groundzeros2015 wrote:
               | > I lost enough money on a big photo site in AWS
               | 
               | I'm sorry what. This is supposed to persuade me?
        
         | Hendrikto wrote:
         | For anybody outside the SV, and especially outside the US, this
         | seems high, yes.
         | 
         | arXiv does not need to and should not optimize for "shareholder
         | value", which is at least nominally the justification for
         | outlandish CEO pay packages.
        
           | kingstnap wrote:
           | arXiv doesn't need much. All they do is host static pdfs
           | uploaded by someone else with free CDN services from Fastly
           | [0]. I'm sure they could get academics to volunteer
           | moderation services as well.
           | 
           | In reality you could host the entire thing for well under
           | $50k/year in hardware and storage if someone else is
           | providing a free CDN. Their costs could be incredibly low.
           | 
           | But just like Wikipedia, I see them very likely very quickly
           | becoming a money hole that pretends to be barely kept afloat
           | by donations, when in reality what's actually happening is
           | that a ridiculous number of rent seekers have managed to
           | ride the coattails of being the de facto preprint server for
           | AI papers to land themselves cushy jobs at a place that
           | spends 90+% of its money on flights, hotels, and wages for
           | its staff.
           | 
           | I'm already expecting their financial reports to look
           | ridiculously headcount-heavy, with Personnel Expenses,
           | Meetings and Travel blowing up, as well as the classic
           | Wikipedia-style "we spend a ton of money on unclear costs" [1].
           | 
           | What's already sad is that they stopped publishing a real
           | broken-down report that actually showed things. Like, look at
           | this beautiful screenshot of an Excel sheet. Imagine if
           | Wikipedia produced anything this clear. [2]
           | 
           | [0] https://blog.arxiv.org/2023/12/18/faster-arxiv-with-fastly/
           | 
           | [1]
           | https://info.arxiv.org/about/reports/FY26_Budget_Public.pdf
           | 
           | [2]
           | https://info.arxiv.org/about/reports/2020_arXiv_Budget.pdf
        
             | OneDeuxTriSeiGo wrote:
             | > arXiv doesn't need much. All they do is host static pdfs
             | uploaded by someone else with free CDN services from Fastly
             | [0]. I'm sure they could get academics to volunteer
             | moderation services as well.
             | 
             | This just isn't true. arXiv nowadays has to deal with major
             | moderation demands due to the influx of absolute drivel,
             | spam, and slop that non-academics and less-than-quality
             | academics have been uploading to the site.
             | 
             | Moderation for arXiv isn't perfect or comprehensive but
             | they put so much work into trying to keep the worst of the
             | content off their site. At this point while they aren't
             | doing full blown peer review, they are putting a lot of
             | work into providing first pass moderation that ensures the
             | content in their academic categories is of at least some
             | level of respectable academic quality.
        
               | prepend wrote:
               | Volunteer moderators are a valid option. And I think may
               | work out better than paid employees.
        
               | OneDeuxTriSeiGo wrote:
               | Volunteer moderators are a valid option; however, this is
               | also the way peer review works and the system is
               | unfortunately very problematic and exploitative.
               | 
               | First pass sanity checks are also a lot less fun than
               | proper peer review so paying moderators to do it is
               | probably safer in the long run or else you end up with
               | cliques of moderators who only keep moderating out of
               | spite/personal vendettas against certain groups or
               | fields.
        
             | weitendorf wrote:
             | > In reality you could host the entire thing for well under
             | $50k/year in hardware
             | 
             | I could pay Anthropic $400 to write more code than you have
             | in your entire lifetime.
             | 
             | Sure, you're able to operate a website acting as
             | essentially the most important and highest volume venue for
             | sharing academic research in the world, but come on, why
             | couldn't I just ask Claude Code or some web developer in a
             | foreign country to do the same thing?
        
           | jjk166 wrote:
           | $300k for a top executive position isn't especially high for
           | anywhere in the US. That's around what the administrative
           | director of a hospital would be making, which seems like a
           | much smaller scope than leading ArXiv. For comparison, my
           | roommate works for a non-profit that serves Philadelphia
           | whose CEO's salary is $1.1 million. The CEO of the Wikimedia
           | Foundation, which is similar in terms of role, has a salary
           | of $450k. General average for US CEOs including for profits
           | is around $800k and for large organizations tens of millions
           | is not atypical.
           | 
           | Non-profits aren't maximizing stock value, but they do need
           | to optimize for stakeholder value - you want to maximize the
           | amount of money being donated in and you want to make the
           | most of the donations you receive, both to advance the
           | primary mission of the non-profit and to instill confidence
           | in donors. This demands competent leadership. The idea that
           | just because something is not being done for profit means the
           | value of the person's contributions is worth less is absurd.
           | So long as the CEO provides more than $300k of value by
           | leading the organization, which might include access to their
           | personal connections, then the salary is sensible.
        
         | DonsDiscountGas wrote:
         | Considering the value and prominence of arxiv to the world,
         | this seems low to me. Although more importantly the rest of the
         | staff needs to be well paid too, and if that's the ceiling it's
         | a bit concerning. It's crazy to me that people thought this was
         | too high.
        
         | prepend wrote:
         | Yes, considering the workload and responsibility of the
         | position.
         | 
         | Non-profits run into the problem of creating cushy jobs that
         | just burn donor money.
         | 
         | Arxiv is basically a giant folder in the cloud and shouldn't
         | have such high paying jobs. At least not if they want rational
         | people to keep donating.
        
       | bonoboTP wrote:
       | I fear their Mozilla-ification and Wikipedia-ification. Scope
       | creep, various outreach feel-good programs, ballooning costs,
       | lost focus etc. And other types of enshittification.
       | 
       | Any change to the basic premise will be a negative step.
       | 
       | They should just be boring quiet unopinionated neutral
       | background infrastructure.
        
         | kergonath wrote:
         | > They should just be quiet unopinionated neutral background
         | infrastructure.
         | 
         | Exactly. It should be a utility. Not quite dumb pipe, but not
         | too far either.
        
           | doctorwho42 wrote:
           | We don't do 'utility' in America. Everything has S.V. brain
           | rot - it's mixed with wall street brain rot, and now if you
           | aren't extracting wealth out of what you have access to - you
           | are failing.
        
             | musicale wrote:
             | I mean... someone needs to "unlock value" from ArXiv,
             | right?
        
         | Hendrikto wrote:
         | > Mozilla-ification
         | 
         | All the Mozilla executives have done for the last 15+ years is
         | 
         | * lay off developers
         | 
         | * spend lots of money on stupid side projects nobody asked for
         | or wants
         | 
         | * increase their own salaries
         | 
         | and all that with the backdrop of falling quality, market
         | share, and relevance.
         | 
         | I would happily donate to Firefox, but this fucked up
         | organization will never see a single cent from me. They will
         | spend it on anything but Firefox, which is the only thing
         | anybody wants them to spend it on.
         | 
         | It might already be too late, and we will be left with a
         | browser monopoly.
        
           | bonoboTP wrote:
           | And it is a risk for Arxiv too: once they start to drink
           | the koolaid and start going to the same cocktail parties that
           | these kinds of nonprofit board members and execs go to, they
           | will feel the need to prance around with some fancy stuff.
           | 
           | "oh no, you see we are not a preprint server host anymore,
           | our mission is a values driven blablabla to make a meaningful
           | change in the blablabla, we have spent X dollars to promote
           | the blablabla, take me seriously please I'm also fancy like
           | you!"
        
             | musicale wrote:
             | Well, maybe they don't need to be a nonprofit. How about a
             | public benefit corporation?
             | 
             | And maybe that public benefit thing, well we don't really
             | need it do we? Now that we're deep into AI you know.
             | 
             | For-profit has a nice ring to it. We're delivering value to
             | founders and shareholders, where it belongs.
        
           | swed420 wrote:
           | > It might already be too late, and we will be left with a
           | browser monopoly.
           | 
           | Ladybird continues to have the appearance of making progress,
           | fwiw:
           | 
           | https://ladybird.org/newsletter/2026-02-28/
        
           | cge wrote:
           | >They will spend it on anything but Firefox, which is the
           | only thing anybody wants them to spend it on.
           | 
           | Mozilla certainly won't spend it on Firefox, because the
           | structure of the organization legally prohibits them from
           | spending any of their donation money on Firefox. The 'side
           | projects' are, at least officially, the real purpose of
           | Mozilla.
        
             | bonoboTP wrote:
             | They built the brand on Firefox then did a bait and switch.
             | How many people who donate to Mozilla know that it's not
             | helping Firefox?
             | 
             | But yeah, this is just how it works. Things can't stay good
             | for too long. One must always be on the lookout for the new
             | small thing that's not yet corrupted. Stay with it for a
             | while until it rots, then jump to the next replacement.
        
           | musicale wrote:
           | > They will spend it on anything but Firefox, which is the
           | only thing anybody wants them to spend it on.
           | 
           | ;_;
        
         | musicale wrote:
         | My prediction exactly.
         | 
         | Maybe a bloated foundation (pursuing expensive objectives
         | completely unrelated to ArXiv's core mission of hosting PDFs),
         | new classes of unnecessary management staff, new and useless
         | paid features that nobody wants, and obnoxious nag banners
         | claiming "ArXiv is not for sale!" but demanding money anyway.
        
       | ACCount37 wrote:
       | Frankly, the only beef I have with arXiv as is: its insistence on
       | blocking AI access.
       | 
       | I had to tell my AI to set up an MCP for "fetch while bypassing
       | arXiv's rate limit" so that it doesn't burn 40k tokens looking
       | for workarounds every time it wants to look at a paper and gets
       | hit with a "sorry, meatbags only" wall.
       | 
       | Very annoying, given how relevant arXiv papers are for ML
       | specifically, and how many papers there are. Can't "human
       | flesh search" through all of them to pick the relevant ones for
       | your work, and they just had to insist on making it harder for
       | AIs to do it too.
        
         | spiralcoaster wrote:
         | I hope they ramp up their blocking of AI access. The last thing
         | we need is providers like this getting hammered by AI.
        
       | vedantxn wrote:
       | we got this before gta 6
        
       | contubernio wrote:
       | What is worrisome about this development, and corollary actions
       | like the hiring of a CEO with a $300,000/year salary, is that the
       | essentially independent and community based platform will
       | disappear. The ArXiv exists because mathematicians and
       | physicists, and later computer scientists and engineers, posted
       | there, freely, their work, with minimal attention to licensing
       | and other commercial aspects. It has thrived because it required
       | no peer review and made interesting things accessible quickly to
       | whomever cared to read them.
       | 
       | A setup as a US-based "non-profit" is worrisome, if only because
       | 300K is an obscene salary even in a for-profit setting. That the
       | US-based posters can't see this is evidence of the basic problem
       | which is that the US, both left and right, has been taken over by
       | a neoliberal feudal antidemocratic nativist mindset that is
       | anathema to the sort of free interchange of ideas that underlay
       | the ArXiv's development in the hands of mathematicians and
       | physicists now swept aside and ignored by machine learning
       | grifters and technicians who program computers.
        
         | doctorwho42 wrote:
         | As a US based academic, I have to say when I saw the salary I
         | immediately gawked. I think it's not Americans but Silicon
         | Valley-ites and tech bros on here who have lived with inflated
         | salary/net worth that think it's just a middle of the road
         | salary. As I regularly interact with friends in engineering who
         | make like $200k + benefits ($), and I wonder why I don't jump
         | ship to that weird land.
        
       | juped wrote:
       | >Cornell, for example, had a limited capacity to pay software
       | developers to maintain and upgrade the site, which still has a
       | very no-frills look and feel.
       | 
       | arXiv is doomed. It was nice while it lasted.
        
         | oscaracso wrote:
         | I am not a software engineer, although I do write programs.
         | What is it about digital infrastructure that requires
         | maintenance? In the natural world, there is corrosion, thermal
         | fluctuation, radiation, seismic activity, vandalism,
         | whathaveyou. What are the issues facing the arxiv demanding the
         | attention of multiple people 'round the clock?
        
           | bonoboTP wrote:
           | They have to update the software stack, replace usage of
           | deprecated APIs, support new latex packages etc. They could
           | probably minimize these by limiting the scope but just
           | keeping a small, tightly scoped software functional is always
           | boring, people want to work on fun new features, they enjoy
           | the brand recognition and feel like they should do more
           | stuff.
           | 
           | I wonder when they will introduce the algorithmic feed and
           | the social network features.
        
       | taormina wrote:
       | Given that Cornell charges what, $50k a year as an Ivy League,
       | $300k feels like almost nothing.
        
         | PaulHoule wrote:
         | This is going to be in NYC where $300k does not go as far as it
         | does in Ithaca.
        
         | peyton wrote:
         | Heh, you might want to look up what they're charging young
         | people now.
        
           | taormina wrote:
           | $71k?! Well, that's 4, 4.5 students worth of tuition then.
        
       | losvedir wrote:
       | arXiv is great. It's just a problem that there's so much slop.
       | What if arXiv offered a subscription service that people in
       | different fields could use to just see a curated selection of the
       | top papers in their field each month. Established researchers in
       | each field could then review some of the preprints for putting
       | into the curated monthly list.
       | 
       | Oh, wait.
        
         | bonoboTP wrote:
         | > see a curated selection of the top papers in their field
         | 
         | https://www.scholar-inbox.com
        
       | hereme888 wrote:
       | From my limited experience, arXiv appears to include many low-
       | quality, unreproducible papers, and some are straight-up self-
       | marketing rather than serious scientific work.
        
         | kingstnap wrote:
         | If you get some more experience you will find normal journals
         | are exactly like that as well.
        
       | whiplash451 wrote:
       | I'm not sure why we're so focused on filtering what gets into
       | arxiv (which is an uphill battle and DOA at this point) vs fixing
       | the _indexing_, i.e. the page rank of academia.
       | 
       | Google "sorted out" a messy web with pagerank. Academic papers
       | link to each other. What prevents us from building a ranking
       | from there?
       | 
       | I'm conscious I might be over-simplifying things, but curious to
       | see what I am missing.
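       | 
       | For concreteness, the idea can be sketched as a PageRank-style
       | power iteration over a citation graph. This is a toy
       | illustration under stated assumptions, not anything arXiv or a
       | search engine actually runs: the paper IDs, the damping factor
       | of 0.85, and the function name are all hypothetical.

```python
def pagerank(citations, damping=0.85, iterations=100, tol=1e-10):
    """Rank papers by a PageRank-style power iteration.

    `citations` maps a paper ID to the list of paper IDs it cites.
    All names and defaults here are illustrative assumptions.
    """
    # Collect every paper that cites something or is cited.
    papers = set(citations)
    for cited_list in citations.values():
        papers.update(cited_list)
    papers = sorted(papers)
    n = len(papers)
    if n == 0:
        return {}

    # Start from a uniform distribution.
    rank = {p: 1.0 / n for p in papers}

    for _ in range(iterations):
        # Papers that cite nothing ("dangling" nodes) spread their rank evenly.
        dangling = sum(rank[p] for p in papers if not citations.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in papers}

        # Each citing paper splits its rank equally among the papers it cites.
        for citing, cited_list in citations.items():
            if not cited_list:
                continue
            share = damping * rank[citing] / len(cited_list)
            for cited in cited_list:
                new_rank[cited] += share

        # Stop early once the ranks have stopped changing.
        delta = sum(abs(new_rank[p] - rank[p]) for p in papers)
        rank = new_rank
        if delta < tol:
            break

    return rank


# Hypothetical four-paper graph: A cites B and C, B cites C, D cites C.
example = {"A": ["B", "C"], "B": ["C"], "C": [], "D": ["C"]}
ranks = pagerank(example)
for paper, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(paper, round(score, 4))
```

       | In this toy graph the most-cited paper, C, ends up on top. The
       | sketch only shows that a ranking can be computed; it does not
       | address whether such a ranking can be gamed.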
        
         | tokai wrote:
         | Page rank was inspired by bibliometrics and evaluation of
         | science publications. It's messed up now because of the
         | rankings. Further fiddling with ranking will not fix the
         | problem.
        
           | j2kun wrote:
           | +1, PageRank was taken from academia. They even cited it in
           | their original work. Funny how the origins of these things
           | get forgotten.
        
         | krick wrote:
         | I am of the same opinion, and ultimately ArXiv becoming a
         | journal that can prevent one from publishing a paper -- no
         | matter how junk it is -- would pretty much kill its purpose.
         | But I suppose that now that flooding the internet with LLM-
         | generated garbage is almost endorsed by some satanic people, it
         | is pretty much a security issue to have some sort of filter on
         | uploads.
         | 
         | Now, honestly, I have no idea why one would spend resources on
         | uploading terabytes of LLM garbage to arXiv, but they sure can.
         | Even if some crazy person is publishing like 2 nonsense papers
         | daily, it is no harm and, if anything, valid data for
         | psychology research. But if somebody actually floods it with
         | non-human-generated content, well, I suppose it isn't even that
         | expensive to make ArXiv totally unusable (and perhaps even
         | unfeasible to host). So there has to be some filtering. But
         | only to prevent the abuse.
         | 
         | Otherwise, I indeed think that proper ranking, linking and
         | user-driven moderation (again, not to prevent anybody from
         | posting anything, but to label papers as more interesting for
         | the specific community) is the only right way to go.
        
         | muhneesh wrote:
         | tangentially related: https://readabstracted.com/
        
       | Drblessing wrote:
       | ArXiv is dead. Expect a paywall within three years, or other
       | enshittification and slop added.
        
         | Apocryphon wrote:
         | Maybe they'll do something like what Anna's Archive did
        
       | hirako2000 wrote:
       | Do research papers published on Elsevier's sort of media remain
       | more prestigious?
       | 
       | I read a dozen papers a month, typically on arxiv, never from
       | paywalled journals. I find the quality on par. But maybe I'm
       | missing something.
        
         | Fomite wrote:
         | This is _very_ variable based on field. HN is heavily biased
         | toward ArXiv-friendly fields.
        
       | krick wrote:
       | It's not that hard to make a mirror of arXiv. Basically, anybody
       | who can pay for hosting (which, I suppose, isn't very cheap now
       | when the whole world uses it). It's a problem to make users
       | switch, because academia seems to have this weird tradition of
       | resisting all practices that, god forbid, might improve global
       | research capabilities and move forward the scientific progress.
       | But then, if arXiv _actually_ becomes unusable, I suppose they
       | won't really have much choice but to switch?
       | 
       | And, FWIW, I do think that arXiv truly has a vast potential to be
       | improved. It is currently in the position to change the whole
       | process of how the research results are shared, yet it is still,
       | as others have said, only a PDF hosting. And since the
       | universities couldn't break out of the whole Elsevier & co. scam
       | despite the internet existing for 30 years, to me, breaking
       | free from the university affiliation sounds like a good thing.
       | 
       | But, of course, I am talking only about the possibilities being
       | out there. I know nothing about the people in charge of the whole
       | endeavor, and ultimately it depends on them only, if it sails or
       | sinks.
        
       | tokai wrote:
       | This is exactly what happened last time when scientific
       | publishing got cornered. Journals run by departments and research
       | groups were spun out or sold off to publishers and independent
       | orgs. And they continued to slowly boil the frog over 50 years
       | with fees and gate keeping.
       | 
       | It's especially problematic because while ArXiv loves to claim to
       | be working for open science, they don't default to open
       | licensing. Many of the publications they host are not Open
       | Access, and are only read access. So there is definitely the
       | potential to close things off at some point in the future, when
       | some CEO needs to increase value.
        
       | lifeisstillgood wrote:
       | I am sure it's a dumb idea but why is there a problem for say the
       | National Science Foundation or something to run a website that
       | replicates ArXiv - if you are from an accredited university or
       | whatever you can publish papers, fulfilling the "pdf store"
       | function.
       | 
       | Then getting peer reviewed is a harder process but one can see
       | some form of credit on the site coming from doing a decent
       | reviewer's job.
       | 
       | I suspect I am missing a lot of nuance ...
        
         | prepend wrote:
         | The moderation is difficult but not unprecedented.
         | 
         | I think NIST hosts the CVE repo (through a contract to MITRE)
        
         | Fomite wrote:
         | Given the last two years and what has been done to science
         | funding, having a load bearing thing like ArXiv not housed with
         | the U.S. government is, I think, pretty self-evidently a good
         | idea.
        
       | MetaMonk wrote:
       | https://youtu.be/4P5xSntVWQE
        
       | jeremie_strand wrote:
       | ArXiv provides such an easy interface to navigate scientific
       | papers, most are from computer science of course. Hope they can
       | grow bigger and solve the paywall pain in open research. Any
       | implications for bioRxiv?
        
         | Fomite wrote:
         | bioRxiv is already housed at Cold Spring Harbor Laboratory,
         | which is an independent non-profit.
        
       | AccessScan wrote:
       | Going independent makes sense for arXiv. But the more interesting
       | part is what it tells us about how we fund the stuff that
       | actually keeps research moving. arXiv runs on about seven million
       | dollars a year and handles hundreds of thousands of papers.
       | That's roughly twenty bucks a paper. This is the backbone of how
       | physicists, computer scientists, and mathematicians share work.
       | Traditional publishers charge thousands per article. The math is
       | almost laughable. arXiv has never had an efficiency problem. The
       | problem is that we've just accepted that something this important
       | should survive on voluntary contributions and the occasional
       | donation saving the day.
       | 
       | Look at what happened with bioRxiv and medRxiv when they spun off
       | into openRxiv. That only happened about a year ago. Nobody knows
       | yet if it actually works long-term or if it just kicks the money
       | problems down the road. But both platforms, totally separately,
       | came to the same conclusion. We need to leave the university.
       | That says something. Universities aren't built to fund outside
       | infrastructure forever. Their budgets follow enrollment, grants,
       | and endowment performance. That doesn't line up with the steady,
       | predictable funding arXiv needs to keep the lights on.
       | 
       | Ginsparg calling it a "Perils of Pauline" situation is probably
       | the most honest thing anyone said about this. Everyone treats
       | arXiv like it will always be there. But it's been one bad year
       | away from serious trouble for most of its life. The real test for
       | the nonprofit won't be the first few years. Cornell and Simons
       | have that covered. It'll be five or ten years from now when the
       | excitement fades and they're competing for donor money against
       | whatever the next crisis in academic publishing turns out to be.
       | 
       | The worry about AI-generated junk is actually where independence
       | could help. A university-hosted arXiv could only spend so much on
       | moderation tools. An independent org with a focused mission can
       | make that a real budget priority. Whether they can keep up with
       | the flood of low-quality submissions is a different question
       | entirely.
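       | 
       | The per-paper figure in this comment can be checked with a few
       | lines of arithmetic. The $7M budget and the roughly $20 per
       | paper come from the comment; the submission volumes in the loop
       | are hypothetical placeholders, not arXiv statistics.

```python
ANNUAL_BUDGET_USD = 7_000_000  # "about seven million dollars a year"


def cost_per_paper(annual_budget_usd, papers_per_year):
    """Average operating cost per paper handled in a year."""
    if papers_per_year <= 0:
        raise ValueError("papers_per_year must be positive")
    return annual_budget_usd / papers_per_year


def implied_volume(annual_budget_usd, usd_per_paper):
    """Number of papers per year implied by a budget and a per-paper cost."""
    if usd_per_paper <= 0:
        raise ValueError("usd_per_paper must be positive")
    return annual_budget_usd / usd_per_paper


# Hypothetical yearly volumes within "hundreds of thousands of papers".
for papers in (200_000, 350_000, 500_000):
    print(papers, round(cost_per_paper(ANNUAL_BUDGET_USD, papers), 2))

# Volume at which the budget works out to twenty dollars per paper.
print(implied_volume(ANNUAL_BUDGET_USD, 20))
```

       | Under these assumptions, "roughly twenty bucks a paper"
       | corresponds to about 350,000 papers a year.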
        
       | ide0666 wrote:
       | The endorsement system is a real barrier for independent
       | researchers. I've been trying to get endorsed for cs.NE for weeks
       | -- the work is published on aiXiv with video results, but without
       | an institutional email or personal connection to an existing
       | author, you're stuck. Glad to see arXiv thinking about
       | independence -- hope they also rethink access for non-
       | institutional researchers.
        
       | tamimy wrote:
       | It's quite interesting to see that a lot of opinions here think
       | ArXiv will turn to shit because it will go "corporate". Are there
       | any examples where this has not been the case?
        
       | beezle wrote:
       | I go back to xxx.lanl.gov days - that is, the beginning. Back
       | then it was all physics, some math and a little quantitative
       | finance (not bitcoin). And the quality was pretty good because it
       | was a _preprint_ archive. In fact, a headline from 2000:
       | 
       | APS and BNL Host XXX e-Print Archive Mirror Feb. 1, 2000
       | 
       | The APS is establishing, in cooperation with Brookhaven National
       | Laboratory, the first electronic mirror in the United States for
       | the Los Alamos e-Print Archive.
       | 
       | Today, from the landing page, it describes itself as "arXiv is a
       | free distribution service and an open-access archive for nearly
       | 2.4 million scholarly articles in the fields of [long list].
       | Materials on this site are not peer-reviewed by arXiv."
       | 
       | Well, that's a large part of the problem. A lot of the stuff
       | there now will never see a journal (even of dubious quality) and
       | there is limited filtering of what new submissions will be
       | stored. GIGO.
       | 
       | Best thing ArXiv could do is go back to their roots - limit the
       | fields and return to preprint only. Spin off the comp sci stuff
       | for sure to someone else along with all its headaches.
       | 
       | fixed: url
        
       ___________________________________________________________________
       (page generated 2026-03-21 23:01 UTC)