[HN Gopher] Behind the scenes: the struggle for each paper (2021)
___________________________________________________________________
Behind the scenes: the struggle for each paper (2021)
Author : lazyjeff
Score : 120 points
Date : 2024-01-07 15:58 UTC (7 hours ago)
(HTM) web link (jeffhuang.com)
| darthoctopus wrote:
| [2021]
| ketzo wrote:
| What a great resource, both for self-reflection and for a student
| who wanted to chase a similar career. I should really do
| something similar for my history of paid work.
|
| It's not like I have a crazy illustrious career or anything, but
| it can feel like kind of a blur, just a rollercoaster that led
| inexorably towards the present, which couldn't be further from
| the truth; I would love to be able to reflect on my successes
| (and failures!) and see the small, concrete steps I took towards
| each.
|
| Even without writing it out, I know the connections I have made
| and the mentors / coworkers / friends who have helped me deserve
| _much_ more credit than any individual strokes of brilliance on
| my part! Another thing that's very easy for me to forget, day-to-
| day.
| ShadowBlades512 wrote:
| I have started to at least write 4-5 bullet points per month of
| my job in my personal notes as a reminder of what stuff I have
| done. I find I will remember a lot of details as long as I have
| notes that remind me that a project even existed or an event
| happened. That has been enough for me.
| ketzo wrote:
| That's a great idea, I think I'll start doing that. Sounds
| super worthwhile for very little effort.
| fallon54 wrote:
| AKA why you probably don't want to be in academia
| ajsnigrutin wrote:
| Yep... Everyone in academia complains about publishing papers,
| about the high prices of publishing, about "publish or perish",
| and then when they get high enough in "academia", require the
| same pain from the newcomers. It's like a closed circle of
| people both requiring papers for maintaining and advancing your
| career and at the same time complaining about those papers
| (and the publishing process), and not even thinking about some
| kind of "change".
| JohnKemeny wrote:
| This is only true to some extent, having myself been on a
| fair number of hiring committees.
|
| While the institution and national agencies measure impact in
| terms of number of "level 1/level 2" papers, colleagues don't
| care at all about this value. What's important is number of
| single-author papers, number of papers without their advisor,
| number of different small group collaborations, and most of
| all, having papers accepted in the top venues.
|
| A person with 50 shit papers will not even be considered for
| the job.
| ajsnigrutin wrote:
| Sure, but the same group of people that complains about
| (the publishing of) papers, is then in a position to change
| that, but doesn't. All those people that went through this
| process and complained, then (well.. a few years later) sit
| in university committees that decide what the hiring (and
| scoring) rules are, what are the requirements for TAs, for
| professors and tenures, etc., and decide, that the "pain of
| publishing" is an ok thing to subject new generations to.
|
| edit: I'm from Slovenia, universities here are "autonomous"
| (not a direct part of the government.. except for being
| government funded), and they decide all the internal rules
| themselves.
| cvwright wrote:
| The problem is that academia is otherwise a pretty cushy
| job. It attracts a lot of people who want the prestige
| and like to talk, but don't actually want to do any work.
|
| Peer review and the paper chase are the least bad
| solution that we've come up with to address this.
| academia_hack wrote:
| I've been giving "Accept - Minor Revisions" to every paper I've
| peer reviewed since getting my PhD other than two that were
| outright plagiarism. Figure it's important to the morale of
| grad students to get some positive validation and the vast
| majority of published research is garbage anyways so I don't
| feel particularly inclined to defend the trash heap as an
| unpaid reviewer. In practice, I find that I've tipped the
| scales in favor of a lot of borderline papers over the years
| and am quite happy about that.
| vladms wrote:
| I think most of the things in life come with a lot of struggle,
| strange things, things that should be different and so on.
|
| Making a startup? Go and check how hard and crazy that is. Make
| a family? Similar convoluted process with ups and downs.
|
| What I think is wrong is that people have a very "idealized"
| image of a scientist scribbling an equation on a board and
| getting some prize (or defeating the aliens). These images are
| good for kids but after high-school I think people should give
| it a thought and say "ok, things are not exactly how I imagined
| in life, let's try to understand more what I like and want". You
| know the same process that makes people realize there is no
| Santa Claus.
| halgir wrote:
| No way - reading this I thought I recalled one of the papers
| (Starcraft from the Stands). Pulled up my Zotero library, and
| sure enough, I cited it in my BA thesis almost ten years ago.
|
| What a pleasant coincidence - thanks for the contribution!
| schneems wrote:
| I had an assignment in the OMSCS course where we had to turn the
| results of a project into a paper and a presentation. It was eye
| opening on why so many CS papers are difficult to decipher.
|
| I'm used to writing on the web where the scroll is unlimited and
| everything is hyperlinkable and potentially interactive. Journal
| papers are limited by length and so was our assignment. I had to
| cut virtually all helpful explanation needed to reproduce my
| results which was deeply frustrating. We were implementing an
| algorithm based on another paper and it was hard because key
| details were omitted or assumptions not stated. After that
| exercise I have to think some of it was intentional to get it
| down to size.
|
| I find most people aren't good at technical communication and
| teaching others without a LOT of practice. Even then it requires
| feedback and iteration to make sure the ideas are communicated
| well. Forcing people to be more succinct and omit details makes
| the final product worse to consume. I don't know how common such
| limitations are these days, but I do know that the average paper
| is still out of reach of the average programmer (where it would
| likely have the most benefit).
| godelski wrote:
| > Journal papers are limited by length and so was our
| assignment
|
| I have always thought this was a bit silly and that it creates
| really weird effects that also decrease readability. An
| interesting point is that reviewers are not required to read
| the appendix of works. So everything is required to be in the
| front matter. This is a bit silly when we do things like
| research graphics or do generative works and such. You want to
| include images and samples but then your space is eaten up.
| What if you want to discuss analysis on those images and
| explore some? You could easily do this on a blog but you're
| forced to throw this into the appendix. But then a reviewer can
| ask a question that's explained there and your work can still
| get rejected because it isn't in the front matter. Another
| weird incentive is that people end up padding works to fit page
| limits. This is because if you turn in a shorter paper
| reviewers will frequently reject your work the same way your
| boss might not think you're working if they don't see you at
| your desk.
|
| We live in the 21st century and we still publish like it's the
| 15th. Computers gave us the ability to embed images, which is
| why there are so many more graphs and charts now, and it's not
| like more pages cost more. So just remove it. Some papers
| should be only a few pages and there's nothing wrong with that.
| Some papers should be far larger and there's nothing wrong with
| that. It's just weird to set these up considering they were
| likely created under other constraints but momentum continued
| and we back justify the continued decisions (there is something
| to be said about readability, but that can just be a reason to
| reject).
|
| Side note: CS groups typically publish in conferences
| outrun86 wrote:
| Distill.pub was one effort to modernize publishing in CS.
| Chris Olah wrote some thoughts [1] about why he didn't feel
| it was tenable. Seems like the primary challenge was the
| additional effort and skill involved in crafting rich-
| content/interactive material.
|
| [1] https://distill.pub/2021/distill-hiatus/
| godelski wrote:
| Honestly, I don't get why we don't just submit to
| OpenReview and call it a day. Paper is visible and
| distributed. There are comment sections where peer review
| can not just happen, but happen in the open (added bonus!).
| You can iterate and even see the difference between
| submissions. What is the conference/journal providing that
| isn't covered here? A stamp of approval? From a well known
| noisy system that creates other disincentives?
| sideshowb wrote:
| Promotion and filtering I guess? What does a record label
| provide when you can just upload music to Spotify?
| godelski wrote:
| > What does a record label provide when you can just
| upload music to Spotify?
|
| I believe this is an illustrative example in support of
| my proposition, not against. Many artists are in fact
| turning away from record labels in favor of self
| publishing. Similarly for books.
|
| But I will say that I still think there's value and so
| I'll expand on my ideas about conferences. I think they
| should exist, but be focused on meet and greets. So
| instead of being an indicator of the validity of work,
| have them invite authors to speak about their works.
| Allow others to sign up for poster sessions. How to do
| that appropriately does need to be worked out, but
| there's nothing wrong with it simply being under
| recommendation from the advisement of the organizing
| members. Yes, there will still be preferential bias, but
| I do mean "still" because we do have preferential biases
| towards certain institutions and labs. This would just
| make it a bit more explicit that they are not the
| arbitrators of quality but just treated as a "reward."
|
| Importantly I think this allows opening the doors for
| different kinds of research that are not incentivized by
| our systems. Most important being reproduction
| vladms wrote:
| Not sure the openness of the review would solve so many
| problems of the system. For example, it would not touch things like
| reproducibility and data and code availability.
|
| Then you will need moderation (or do you imagine that
| things will be civilized between people on the internet?)
| and would need to manage various possibilities of
| bullying/targeting/etc. Of course these things can happen
| now, but difference would be between a potentially fully
| automated and simple system and something very clunky (be
| friends with an editor, convince him to report who are
| the reviewers, manage to recognize another of his papers,
| etc.)
| godelski wrote:
| > For example, it would not touch things like reproducibility and
| data and code availability.
|
| These are different issues, which are certainly
| important. But I do think in some way this would help.
| OpenReview does allow you to post comments many months
| after. Effectively think about this as a GitHub issues
| page. It certainly could be organized better but it is
| better than what exists now. OR also has links for code
| and community implementations (as does arxiv now). Here's
| an example that has all these things[0]. Granted data is
| missing, but I don't see why this can't also be
| integrated, but would need to also push cultural norms.
|
| > Then you will need moderation
|
| I think OR has this a bit solved, similarly arxiv. They
| are not anonymous accounts and are tied to your ORCID
| record. Arxiv requires you to have a verifier that is
| already someone with an arxiv account. Yes, this can be
| abused, but it is also an easier moderation problem than
| say Reddit or HN even. I think if you're posting bullying
| comments under a named profile, then it is good that that
| is visible so others can see. Mind you, bullying does
| already exist but it is just behind closed doors. It is
| worse now because only the Area Chair can take action and
| often they are overworked and works do get dismissed
| (which results in A LOT of wasted time, and money)
| because of this bullying. The larger the field, the more
| noise too and the more this happens. It is just far less
| common to see people bullying in public than behind
| closed doors.
|
| I must stress though, that there is no perfect system
| here. There is no system that can make the amount of
| bullying 0. So we have to be careful in our critiques
| because there will always be valid critiques that are in
| fact of concern (like this one) but are fundamentally
| unsolvable. The question then becomes if we improve upon
| the existing frameworks and if whatever costs have been
| made are worth the added benefits. So I just want to make
| sure that this idea isn't killed because of an impossible
| bar, despite the critique being valid.
|
| Edit: I'd actually add that this system encourages
| reproduction. Because if we still measure on citations
| and number of publications this means that reproduction
| works can still count towards those metrics and thus
| someone's career advancements. The whole
| conference/journal system currently discourages such
| effort in favor of the absurdly nebulous novelty concept
| (which also makes papers noisy). My proposal would also
| allow for the publication of failures, which is also an
| important thing for academics.
|
| [0] https://openreview.net/forum?id=Hkxzx0NtDB
| jltsiren wrote:
| Page limits force you to focus. As a researcher, you are
| often expected to communicate your ideas in 1 page, 3 pages,
| 10 pages, or 30 pages, for various purposes. If a journal
| asks for a 10-page paper, you write a 10-page paper. If a
| conference asks for a 1-page abstract, you write a 1-page
| abstract. Most people reading a paper are not interested in
| going through all the details, and those details should
| usually not be in the main paper.
|
| It's also easier to find reviewers for short papers than for
| long ones.
|
| Some of the issues you mention are specific to CS conferences.
| Because there is only time for 1-2 rounds of reviews, the
| reviews focus more on accepting/rejecting the paper and less
| on clearing any misunderstandings before judging it.
| Conferences are also more likely to have one-size-fits-
| all page limits, while journals often have several categories
| of papers with different expectations of length.
| sideshowb wrote:
| I think desirability of page limits is very subject
| specific. Some people will just waffle if you don't give
| them a page limit. Other times it means there's not room
| for the technical details.
| godelski wrote:
| But the reviewers can reject if it isn't enough or reject
| if it is too much. What I'm arguing is the alignment
| mechanism already exists. The page limit is
| overconstraining.
| godelski wrote:
| > Page limits force you to focus.
|
| This can be solved in better ways, which is, in fact,
| reviewers. I'm okay with a soft requirement but a
| standardization is what I'm getting at as being
| problematic. Some papers are noisy because they should be 3
| pages but are 10. Some papers are noisy because they are 10
| pages and should be 30. There is no universal rule, and
| that's what I'm getting at.
|
| > It's also easier to find reviewers for short papers than
| for long ones.
|
| That's a separate problem that needs to be addressed, but
| is not easy.
|
| > Some of the issues you mention are specific to CS
| conferences.
|
| Yes, but the author here is CS and we are on a CS focused
| website. But in general what I said isn't specific to
| conferences. If conferences are the problem then let's
| abandon them in favor of good science instead of keeping
| them around (or turn them into being meetup focused).
| Certainly the lack of back and forth between authors and
| reviewers is not a meaningful review process (most author
| rebuttals are limited to one page and often reviewers are
| not aligned in critiques). Are we all on the same team
| (better science) or strictly competing against one another?
| taopai wrote:
| Papers... our new religion...
| godelski wrote:
| > But this paper was critical to getting me accepted to a Ph.D.
| program. Why do I think that? Well I was rejected by every Ph.D.
| program I applied to before this publication (but that's another
| story), a story about people and opportunity.
|
| This is an interesting note. We're talking about a student from
| one of the top CS schools (UIUC) and applying to another top
| school (UW). If you think about this a bit carefully, the paper
| being published did not change who he was or his capabilities, it
| was simply a difference in measured (distinct from measurable)
| signal.
|
| It's incredible how many extremely noisy signals we use in
| academia but act as if we use a clear meritocracy. The review
| process is extremely noisy itself, with computer science in
| particular being generally more noisy given its preference of
| conferences over journals. I'm glad Jeff mentions people and
| opportunities, and it reminds me of the old saying about there
| being no self made man. But I think this is a very clear example
| of an instance where we need to think harder and more carefully.
| Counterfactually, it is almost certain that had that paper been
| rejected, but all else stays the same (i.e. getting into UW), his
| success story would also not change. Signals are definitely hard
| to measure and certainly schools are getting a lot of applicants,
| so I don't blame anyone for doing this, but I think it is
| incredibly important to remember these counterfactuals. To
| remember that metrics are guides and not causal variables
| themselves. Because there's a great irony in that metrics destroy
| meritocracies.
| nkurz wrote:
| Your point is correct, but I'm not surprised by the difference.
| I think "legibility" is the term of art here. Writing a paper
| like this makes it almost[1] certain to the institution he is
| applying to that he is capable of writing a paper of this
| quality, while all the other metrics (GPA, GRE, etc) are much
| more probabilistic. Since someone incapable of writing such a
| paper is probably unsuited for a PhD, it seems entirely
| appropriate to choose applicants who have demonstrated ability
| to clear this bar over those that have not.
|
| [1] "Almost" to account for the slight chance that he didn't
| actually author the paper but somehow managed to get his name
| put on it anyway.
| godelski wrote:
| I agree and there's a lot in my comment to point to that. But
| my point is to distinguish between the metrics and the goals.
| I'm certain the author included in their CV that they had a
| pending paper when applying, so there is a signal, albeit a
| weaker one, but publishing is a weak signal to begin with.
|
| I agree that you need to use metrics. But we need to be clear
| that metrics are not enough and very incomplete themselves.
| With something like admissions, I'm not sure there's anything
| except noisy signals and the strongest one by far is the
| interview.
|
| > Since someone incapable of writing such a paper is probably
| unsuited for a PhD,
|
| I very much disagree with this. The explicit purpose of
| schooling is to train people. Many undergrads are not going
| to have the opportunities to publish. It is not hard to train
| someone to write something publishable and this is not
| something I would be much concerned with myself given how
| much writing they're going to be doing over the next few
| years. The far more valuable skills are in being able to
| perform research which is quite ambiguous (there are at least
| 2 ways to read this sentence and both are correct: research
| type v measure). Your first 2 years of your PhD are almost
| exclusively training, with more class work and learning how
| to begin research. This isn't a job you're applying for, it
| is a training program.
| nkurz wrote:
| >> Since someone incapable of writing such a paper is
| probably unsuited for a PhD
|
| > I very much disagree with this.
|
| Your disagreement is justified. I phrased that poorly. I
| meant it as a shorthand for "incapable of being trained to
| write such a paper". Showing that you already have the
| skill is proof, everything else just points to the
| possibility with varying degrees of accuracy.
|
| I in turn disagree that "the purpose of schooling is to
| train people", at least if "schooling" refers to PhD
| programs. I think it's more that there aren't enough
| applicants who are able to perform without extensive
| training, so in practical terms PhD programs need to be
| willing to provide training. But at the same time, it's
| perfectly understandable that they would prefer to take
| applicants who have demonstrated ability to perform over
| those with statistical potential.
|
| I'd prefer something like "The purpose of PhD programs is
| to advance the field". I'm personally in the odd category
| that I've co-authored several computer science research
| papers despite having dropped out to become a programmer
| prior to my BA. I've demonstrated my ability to perform
| much of the role of a PhD while simultaneously
| demonstrating that I perhaps shouldn't be relied upon to
| finish!
| godelski wrote:
| I see your point and I think that brings us a bit closer
| to alignment. But I think if someone is __incapable__ of
| writing such a paper there would likely be many larger
| flags and they probably should not have been able to pass
| their undergraduate curriculum.
|
| I do want to make it clear: I'm not opposed to arbitrary
| filters when there is a high number of applicants and you
| simply need to reduce the number. I am opposed to
| pretending that such a filter is not arbitrary. I think
| we need to be clear about how strong of a signal any
| filter is, and be quite explicit that they are not all
| equal indicators. That is my main point: being explicit
| about the strength of a signal.
|
| With regard to training, I do agree that schooling isn't
| __just__ training, but I'd fully disagree that this isn't
| one of the most important aspects of it, even in grad
| school. Your first two years (in US systems) are nearly
| identical to a masters and highly focused on classes.
| What are classes but training? Even being a TA or
| lecturer is, in part, training (full instructor of record
| would not be). Post conditional, I still think you are in
| training at least up until candidacy. That is much more
| arguable given the variability of advisors, with some
| being very hands on (training) and some being very hands
| off (on your own).
|
| I'd prefer something like "The purpose of PhD programs is
| to train people to advance the field." Because by all
| accounts, it seems like you've done this (even with the
| self-deprecating humor. That is exceptionally common in
| PhDs too lol). I still maintain training because this
| isn't the end, but the beginning. Post PhD is where you
| can choose to go to be an academic researcher or industry
| researcher (or abandon research). Those are the actual
| jobs (which should have continued training) but your
| degree is more akin to a certification from your
| institution. You do come out with a body of work that is
| distinct from the institution, but the institution's goal
| is not to keep you around and continue performing work.
| They are explicitly formed to graduate you. To educate
| you. And what is education except a form of training?
|
| Fwiw, I think we are decently aligned, but sometimes text
| is hard to communicate, especially post by post. I do
| think your critiques are valid, even where I disagree.
| dlemire wrote:
| > I'd prefer something like "The purpose of PhD programs
| is to advance the field".
|
| If you read Wikipedia under 'Doctor of Philosophy', you
| will find that a Ph.D. was once more of a prestigious
| title you got after doing the scholarship:
|
| "The first higher doctorate in the modern sense was
| Durham University's DSc, introduced in 1882. This was
| soon followed by other universities, including the
| University of Cambridge establishing its ScD in the same
| year and the University of London transforming its DSc
| into a research degree in 1885. These were, however, very
| advanced degrees, rather than research-training degrees
| at the PhD level--Harold Jeffreys said that getting a
| Cambridge ScD was "more or less equivalent to being
| proposed for the Royal Society."
|
| It is still possible to get a doctorate in this manner.
| Please see wikipedia under 'Doctor of Philosophy by
| publication'.
|
| "A Doctor of Philosophy by publication (also known as a
| Ph.D. by Published Work, PhD by portfolio or Ph.D. under
| Special Regulation) is a manner of awarding a Ph.D.
| degree offered by some universities in which a series of
| articles usually with a common theme are published in
| scholarly, peer-reviewed journals to meet the
| requirements for the degree, in lieu of presentation of a
| final dissertation. Many PhD by Publication programs
| require the submission of a formal thesis and a viva
| voce."
|
| It is offered in several countries in Europe. The
| wikipedia entry is incomplete: it is not just offered in
| the UK.
|
| Furthermore, it is relatively common to get advanced
| degrees from well known universities (e.g., Harvard)
| without having an undergraduate degree.
| BrandoElFollito wrote:
| Another thing is that there is not enough pushback from the
| community at large.
|
| My PhD thesis was less than 40 pages long. The introduction was
| 1/2 a page (basically "if you need an introduction you should not
| read this, here are 3-4 books to get you started").
|
| Then I copied/pasted from my articles and then came the
| acknowledgments (which I actually found valuable because I wanted
| to thank my advisor for his non-science-related help and a friend
| for her magnificent idea that turned around the thesis. And my
| parents, wife, dog etc.)
|
| Then the conclusion ("brilliant work")
|
| And then a discussion with myself about everything that I fucked
| up and what could be improved (my advisor fainted on that one).
|
| The jury was 8 people. The younger/more dynamic ones were super
| happy (especially that they made their review a page long as
| well). The older ones were disgusted and said that clearly. I got
| my PhD.
|
| I fought in Academia for a few years to bring some change but
| eventually left (also for other reasons). If I had stayed for my
| whole career I would have tried again and again to change the
| status quo.
| dotnet00 wrote:
| A friend who recently finished her PhD had a similar
| experience, where all the senior scientists at our lab were
| concerned because her thesis was "only" 100 pages long and she
| didn't go through a professional editor to have it perfected.
|
| My preliminary defense thesis had to be 50+ pages, but during
| the presentation, it was pretty obvious that the committee had
| at best looked at the table of contents. It all feels like such
| an unnecessary waste of effort. Even with my own thesis, over
| half of it is just padding with very fundamental background
| information because the work isn't really so complicated as to
| require that many pages to discuss, it's just demonstrating
| more advanced simulation capabilities by implementing GPU
| acceleration for a niche but simulation heavy field.
| BrandoElFollito wrote:
| Theses in my time were about 200 pages long. A friend of mine
| wrote two _tomes_.
|
| I clearly stated that I would not waste my time and the
| reviewers are free to provide comments and we will see during
| the defense.
|
| I found out that a lot of these "rules" are traditions that
| one can challenge and suddenly they are not traditions
| anymore.
| dotnet00 wrote:
| Yes, the only hard rule in my department is having a
| minimum of 50 pages, the idea that 100 pages is not enough
| came from the scientists applying their own experiences
| from years ago. Technically there was nothing they could do
| about her thesis having fewer pages, but as inexperienced
| students, it's obviously a little scary when people you
| look up to sound concerned (since academia is full of all
| sorts of unproductive and unstated expectations).
|
| A friend at another department only had a minimum
| requirement of 5 pages, and his thesis ended up being just
| a collection of his publications.
| amadeuspagel wrote:
| There's this new thing that some academics are working on at CERN
| - kind of like academic papers, with references and so forth, but
| on the computer.
|
| Once this is ready, people will just be able to publish their
| "papers" there. I guess they'll be called something else then.
| But this sort of struggle to publish a "paper" will no longer be
| necessary.
| jll29 wrote:
| Thanks for sharing a behind-the-curtain view on the history of
| your publications.
|
| Thank you even more for publishing WebGazer and for following a
| "systems" approach in your research, when most people produce
| only papers. It's systems as research artifacts that encode the
| exact methods as described in the papers but in sufficient detail
| to be executable that drive innovation. Sadly, system papers are
| rather hard to publish, despite taking longer (software that is
| released needs to be much more polished than software that you
| are going to keep to yourself).
| patrickmay wrote:
| Nearly three times the number of papers published by Claudine
| Gay. Why isn't he President of Harvard?
| vaidhy wrote:
| I downvoted this comment because it is not pertinent to this
| topic and just flamebait.
___________________________________________________________________
(page generated 2024-01-07 23:00 UTC)