[HN Gopher] 1,600 days of a failed hobby data science project
___________________________________________________________________
1,600 days of a failed hobby data science project
Author : millimacro
Score : 146 points
Date : 2024-12-08 21:29 UTC (1 day ago)
(HTM) web link (lellep.xyz)
(TXT) w3m dump (lellep.xyz)
| fardo wrote:
| The author's right about storytelling from day one, but then
| immediately throws cold water on the idea by saying it would have
| been a bad fit for this project.
|
| This feels in error, as the big value of seeking feedback and
| results early and often on a project is that it forces you to
| confront whether you're going to want or be able to tell stories
| in the space at all. It also gives you a chance to rekindle
| waning interests, get feedback on your project from others, and
| avoid ratholing into something for about five years without ever
| having to engage with the public.
|
| If a project can't emotionally bear day one scrutiny, it's
| unlikely to fare better five years later when you've got a lot of
| emotions about incompleteness and the feeling your work isn't
| relevant anymore tied up in the project.
| rixed wrote:
| Would you be able to recommend a project whose author did engage
| in such public storytelling from early on?
| Swizec wrote:
| Thinking, Fast and Slow is the result of some 20 years of
| regularly publishing and talking about those ideas with
| others.
|
| Most really memorable works fit that same mold if you look
| carefully. An author spends years, even decades, doing small
| scale things before one day they put it all together into a
| big thing.
|
| Comedy specials are the same. Develop material in small scale
| live with an audience, then create the big thing out of
| individual pieces that survive the process.
|
| Hamming also talks about this, as open-door vs closed-door
| researchers, in his famous You and Your Research essay.
| rjrdi38dbbdb wrote:
| The title seems misleading. Unless I'm missing something, all he
| did was scrape a news feed, which should only require a couple
| days of work to set up.
|
| The fact that he left it running for years without finding the
| time to do anything with the data isn't that interesting.
| amelius wrote:
| Yes, his #1 advice should be "do something with the data you
| collected".
| plaidfuji wrote:
| I'm not sure I would call this a failure.. more just something
| you tried out of curiosity and abandoned. Happens to literally
| everyone. "Failed" to me would imply there was something
| fundamentally broken about the approach or the dataset, or that
| there was an actual negative impact to the unrealized result.
| It's very hard to finish long-running side projects that aren't
| generating income, attention, or driven by some quasi-
| pathological obsession. The fact you even blogged about it and
| made HN front page qualifies as a success in my book.
|
| > If I would have finished the project, this dataset would then
| have been released and used for a number of analyses using
| Python.
|
| Nothing stopping you from releasing the raw dataset and calling
| it a success!
|
| > Back then, I would have trained a specialised model (or used a
| pretrained specialised model) but since LLMs made so much
| progress during the runtime of this project from 2020-Q1 to
| 2024-Q4, I would now rather consider a foundational model wrapped
| as an AI agent instead; for example, I would try to find a
| foundation model to do the job of for example finding the right
| link on the Tagesschau website, which was by far the most
| draining part of the whole project.
|
| I actually just started (and subsequently ---abandoned--- paused)
| my own news analysis side project leveraging LLMs for
| consolidation/aggregation.. and yeah, the web scraping part is
| still the worst. And I've had the same thought that feeding raw
| HTML to the LLM might be an easier way of parsing web objects
| now. The problem is most sites are wise to scraping efforts, and
| it's not so much a matter of finding the right element as
| bypassing the weird click-thru screens, tricking the site into
| thinking you're a real browser, etc...
| smcin wrote:
| > Nothing stopping you from releasing the raw dataset and
| calling it a success!
|
| Right. OP: release it as a Kaggle Dataset
| (https://www.kaggle.com/datasets) and invite people to
| collaboratively figure out how to automate the analyses. (Do
| you just want to get sentiment on a specific topic (e.g.
| vaccination, German energy supplies, German govt approval)? Or
| quantitative predictions?) Start with something easy.
|
| > _for example, I would try to find a foundation model to do
| the job of for example finding the right link on the Tagesschau
| website, which was by far the most draining part of the whole
| project._
|
| Huh? To find the specific date's news item corresponding to a
| given topic? Why not just predict the date-range, e.g. "Apr-Aug
| 2022"?
|
| > _and yeah, the web scraping part is still the worst._
|
| Sounds wrong. OP, fix your scraping. (unless it was anti-AI
| heuristics that kept breaking it, which I doubt since it's
| Tagesschau). But Tagesschau has RSS feeds, so why are you
| blocked on scraping?
| https://www.tagesschau.de/infoservices/rssfeeds
|
| Compare to: the Kaggle Dataset "10k German News Articles for
| topic classification" (Schabus, Skowron & Trapp, SIGIR 2017)
| [https://www.kaggle.com/datasets/abhishek/10k-german-news-
| art...]
| IanCal wrote:
| I'll put a shoutout for https://zenodo.org/ and
| https://figshare.com/ as places to put your data, where
| you'll get a DOI and can let someone that's not a company
| look after hosting and backing it up. Zenodo is hosted as
| long as CERN is around (is the promise) and figshare is
| backed by the CLOCKSS archive (multiple geographically
| distributed universities).
| xelxebar wrote:
| Personally, I think it's helpful to feel disappointment and
| insufficiency when those emotions pop up. They are the voices
| of certain preferences, needs, and/or desires that work to
| enrich our lives. Recontextualizing the world into some kind of
| positive success story can often gaslight those emotions out of
| existence, which can, paradoxically, be self-sabotaging.
|
| The piece reads to me like a direct and honest confrontation
| with failure. It means the author thinks they can do better and
| is working to identify unhelpful subconscious patterns and
| overcome them.
|
| Personally, I found the author's laser focus on "data science
| projects" intriguing. I have a tendency to immediately go meta
| which biases towards eliding detail; however, even if overly
| narrow, the author's focus does end up precipitating out
| concrete, actionable hypotheses for improvement.
|
| Bravo, IMHO.
| querez wrote:
| Some very weird things in this.
|
| 1. The title makes it sound like the author spent a lot of time
| on this project. But really, this mostly consisted of noting down
| a couple of URLs per day. So maybe 5 min / day = ~130h spent on
| the project. Let's say 200h to be on the safe side.
|
| 2. "Get first analyses results out quickly based on a small
| dataset and don't just collect data up front to "analyse it
| later"" => I think this actually killed the project. Collecting
| data for several years without actually doing anything with it
| is not a sound project.
|
| 3. "If I would have finished the project, this dataset would then
| have been released" ==> There is literally nothing stopping OP
| from still doing this. It costs maybe 2h of work and would
| potentially give a substantial benefit to others, i.e., turn this
| project into a win after all. I'm very puzzled why OP didn't do
| this.
| apwell23 wrote:
| yep, I spent more time on Duolingo for a 600+ day streak and
| can barely speak Spanish.
| rrr_oh_man wrote:
| That seems to be a pattern
| galleywest200 wrote:
| It is because you never really practice talking with
| Duolingo. I am quite good at _reading_ French now, though.
| pessimizer wrote:
| > I am quite good at reading French now, though.
|
| If you are, that's actually quite an achievement, and a
| good one. If you're talking about French outside of Duolingo,
| that is.
|
| I do not normally hear of people getting to reading
| fluency through Duolingo.
| wizzwizz4 wrote:
| Duolingo used to have a really good feature where you
| read through and collaboratively translated texts, but
| they shut it down years back.
| j_bum wrote:
| Wow I forgot about that! When I was using it for French
| many years ago, I imagined they were using it as a way to
| generate free translations, but I still found it
| enjoyable and useful.
|
| Wonder why they took it away.
| smcin wrote:
| Well, you can't practice producing unconstrained sentences,
| only ones within their very narrow training wheels.
| xandrius wrote:
| Duolingo is a pretty bad tool for learning a language; it's
| good at making you feel like you're learning, though.
| waste_monk wrote:
| At this point it's more about being scared of the bird.
| mettamage wrote:
| Just to give a nuanced perspective on duolingo.
|
| My wife only did 50 hours of duolingo in total the past 2
| years. Combine that with me teasing her in Dutch and she's
| actually making progress.
|
| Duolingo is a chill tool to learn some vocab. That vocab
| then gets acquired by talking to me. We talk Dutch for two
| minutes per day at most, so about 12 hours in total per year.
|
| She is 67% done with duolingo. So we bought the first real
| book to learn Dutch (De Opmaat).
|
| That book is IMO not for pure beginners. But for the level
| my wife was at, it seems perfect.
| selimthegrim wrote:
| Do you think it would be good for Flemish too or speaking
| standard Dutch in Belgium?
| mettamage wrote:
| I don't know how one would learn Flemish from books. I
| think you'd need to go to Belgium and speak Dutch there
| and then see what the differences are.
|
| Dutch and Flemish are interchangeable though. Sometimes
| it falls apart based on accent, but not on language.
| rjh29 wrote:
| I finished the whole tree in French and had nothing to show
| for it either. It really is a fun way to feel like you're
| learning, without connecting you to the language or culture
| in any significant way.
| Wololooo wrote:
| It's a useful tool if you're immersed in the language; it's
| not key to your learning, but it can help tremendously.
| MarcelOlsz wrote:
| Anki is the way, especially with their new FSRS algo.
| bowsamic wrote:
| Yep, any good textbook or course with Anki for aiding raw
| memorisation. By far the best way to go
| raister wrote:
| I feel this whilst learning (or trying to learn) German: when I
| think "how would I say this in German?" I get nothing but a
| blank in my mind. I'm a good "speaker" though, and sadly, I
| feel I'm not going anywhere either...
| katzenversteher wrote:
| Immerse yourself in the language. In Germany we have
| almost everything dubbed, so you can watch pretty much any
| popular movie or TV series in German or read any popular
| book in German. Besides that there are also quite a lot of
| German productions.
| ben_w wrote:
| Indeed.
|
| For learners, I'd also currently recommend "Easy German"
| podcasts and YouTube videos, as they come in all skill
| levels, are free, and are well made.
|
| https://youtube.com/@easygerman?si=EQdZPHMZ0lPNEl6V
| coffeecantcode wrote:
| Watch Dark on Netflix in the original German on repeat; it's a
| great way to subconsciously make note of tones and pronunciation
| while also watching an awesome show. Be very intentional
| about it though.
| Insanity wrote:
| For me, nothing beats in-person classes, short of a native
| speaker with whom you can interact. Being forced to actually
| speak the language in "mock settings" makes all the
| difference.
|
| And even if you don't get your grammar completely right, you
| will learn enough to survive in a real-life setting.
|
| I learned Spanish through a combination of both - I took
| Spanish classes after I started dating my Mexican wife,
| enough to get conversational. Then I started interacting in
| Spanish with her family, which helps me now maintain the
| language without needing the classes.
| ben_w wrote:
| Likewise; I can say the same about Arabic on Duolingo, where
| I never even mastered the alphabet.
| morkalork wrote:
| Point number 2. is super important for non-hobby projects.
| Collect a bit of data, even if you have to do it manually at
| first and do a "dry run" / first cut of whatever analysis
| you're thinking of doing so you confirm you're actually
| collecting what you need and what you're doing is even going to
| work. Seeing a pipeline get built, run for like two months and
| then the data scientist come along and say "this isn't what we
| needed" was complete goddamn shitshow. I'm just glad I was only
| a spectator to it.
| IanCal wrote:
| They touch on something relevant here and it's a great point
| to emphasise
|
| > The emphasis on preserving raw HTML proved vital when
| Tagesschau repeatedly altered their newsticker DOM structure
| throughout Q2 2020. This experience underscored a fundamental
| data engineering principle: raw data is king. While parsers
| can be rewritten, lost data is irretrievable.
|
| I've done this before, keeping full, timestamped, versioned
| raw HTML. That still risks breaking when sites shift to
| JavaScript-rendered pages, but keeping your _collection_ and
| _processing_ as distinct as you can, so you can rerun things
| later, is incredibly helpful.
|
| Usually, processing raw data is _cheap_. Recovering raw data
| is _expensive_ or _impossible_.
|
| As a bonus, collecting raw data is usually easier than
| collecting and processing it, so you might as well start
| there. Maybe you'll find out you were missing something, but
| it's no worse than if you'd tied things together.
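|
| As a rough illustration, a minimal sketch of that separation in
| Python (file layout and function names are my own, not the
| author's setup):
|
|     import time, pathlib, requests
|
|     ARCHIVE = pathlib.Path("raw_html")
|
|     def collect(url):
|         # Collection: fetch and archive the raw bytes, nothing else.
|         resp = requests.get(url, timeout=30)
|         resp.raise_for_status()
|         ARCHIVE.mkdir(exist_ok=True)
|         path = ARCHIVE / f"{int(time.time())}.html"
|         path.write_bytes(resp.content)
|         return path
|
|     def process(path):
|         # Processing reads only from the archive, so it can be
|         # rewritten and rerun whenever the DOM changes.
|         return path.read_bytes().decode("utf-8", errors="replace")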
|
| edit
|
| > Huh? To find the specific date's news item corresponding to a
| given topic? Why not just predict the date-range, e.g. "Apr-
| Aug 2022"?
|
| They say they had to manually find the links to the right
| liveblog subpage. So they had to go to the main page, find
| the link and then store it.
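|
| That step looks automatable, though. A minimal sketch, assuming
| the ticker links carry a recognisable "liveblog" substring in
| the href (an assumption to verify against the real site):
|
|     import requests
|     from bs4 import BeautifulSoup
|
|     html = requests.get("https://www.tagesschau.de/", timeout=30).text
|     soup = BeautifulSoup(html, "html.parser")
|     # Collect every front-page link whose URL mentions "liveblog".
|     links = {a["href"] for a in soup.find_all("a", href=True)
|              if "liveblog" in a["href"]}
|     print(sorted(links))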
| IanCal wrote:
| While I understand the points, I think it's worth being kinder
| to someone coming out to write about how they failed with a
| project.
|
| > 1. The title makes it sound like the author spent a lot of
| time on this project. But really, this mostly consisted of
| noting down a couple of URLs per day. So maybe 5 min / day =
| ~130h spent on the project. Let's say 200h to be on the safe
| side.
|
| Consistent work over multiple years shouldn't be looked down on
| like this. If you've done something every day for years it's
| still a lot of time in your life. We're not econs and so I
| don't think summing up the time really captures it either.
|
| > 3. "If I would have finished the project, this dataset would
| then have been released" ==> There is literally nothing
| stopping OP from still doing this. It costs maybe 2h of work
| and would potentially give a substantial benefit to others,
| i.e., turn this project into a win after all. I'm very puzzled
| why OP didn't do this.
|
| They might not realise how to do this sustainably, they might
| be mentally just done with it. It may be harder for them to
| think about.
|
| I'd recommend also that they release the data. If they put it
| on either Zenodo or Figshare it'll be hosted for free and
| referenceable by others.
|
| > 2. "Get first analyses results out quickly based on a small
| dataset and don't just collect data up front to "analyse it
| later"" => I think this actually killed the project.
|
| I agree, but again on the kinder side (because they also agree
| I think) there are multiple reasons for doing this and focusing
| on why might be more productive.
|
| 1. It gets you to actually process the data in some useful
| form. So many times I've seen things fail late on because
| people didn't realise something like "how are dates formatted"
| or whether some field was often missing or you just didn't
| capture something that turns out to be pretty key (e.g. you
| scrape timestamps, then realise that at some point the site
| changed them to relative ones like "two weeks ago" and you
| never noticed).
|
| This can be as simple as just plotting some data, counting
| uniques, anything (a sketch follows at the end of this
| comment). The automated system will fall over when things go
| wrong and you can check it.
|
| 2. What do people care about? What do you care about? Sometimes
| I've had a great idea for an analysis only to realise later
| maybe I'm the only one that cares or worse, the result is so
| obvious it's not even interesting to me.
|
| 3. Keeping interest. Keeping interest in a multi-year project
| that's giving you something back can be easier than something
| that's just taking.
|
| 4. Guilt. If I spend a long time on something, I feel it should
| be better. So I want to make it more polished, which takes
| time, which I don't have. So I don't add to it, then I'm not
| adding anything, then nothing happens. It _shouldn't_ matter,
| but I've long realised that just wishing my mind worked
| differently isn't a good plan and instead I should just plan
| for reality. For that, doing something fast feels much better:
| I am happier releasing something that's taken me half a day and
| looks kinda-ok, because not much time is sunk into it yet.
|
| 5. Get it out before something changes. COVID had no clear
| endpoint known up front (and arguably still doesn't).
|
| 6. Ensure you've actually got a plan. Unless you've got a very
| good reason, you can probably build what you need to analyse
| things and release it earlier. You can't run an analysis on an
| upcoming election, but even then you could do it on a previous
| year and see things working. This can help with motivation
| because at the end you don't have "oh right now I need to write
| and run loads of things" you just need to hit go again.
| mNovak wrote:
| "The data collection process involved a daily ritual of manually
| visiting the Tagesschau website to capture links"
|
| I don't know what to say... I'm amazed they kept this up so long,
| but this really should never have been the game plan.
|
| I also had some data science hobby projects around covid; I got
| busy, lost interest after 6 months. But the scrapers keep running
| in the cloud, in case I get motivated again (anyone need
| structured data on eBay listings for laptops since 2020?). That's
| the beauty of automation for these sorts of things.
| plaidfuji wrote:
| Do you just pay the bill for the resources indefinitely?
| hansvm wrote:
| I'm not the person you're asking, but I maintain a number of
| scraping projects. The bills are negligible for almost
| everything. A single $3/mo VPS can easily handle 1M QPS
| (enough for all the small projects put together), and most of
| these projects only accumulate O(10GB)/yr.
|
| Doing something like grabbing hourly updates of the inventory
| of every item in every Target store is a bit more involved,
| and you'll rapidly accumulate proxy/IP/storage/... costs, but
| 99% of these projects have more valuable data at a lesser
| scale, and it's absolutely worth continuing them on average.
| NavinF wrote:
| Inbound data is typically free on cloud VMs. CPU/RAM usage is
| also small unless you use chromedriver and scrape using an
| entire browser with graphics rendered on CPU. We're talking
| $5/mo for most scraping projects.
| mNovak wrote:
| I'm paying < $0.50 a month, and that's primarily driven by S3.
| For the scraping itself I'm using lambda, with maybe minutes
| of runtime per day.
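|
| For anyone curious, a minimal sketch of that shape of handler
| (bucket name and target URL are placeholders):
|
|     import time, urllib.request, boto3
|
|     s3 = boto3.client("s3")
|
|     def handler(event, context):
|         # Fetch the page and archive the raw bytes under a
|         # timestamped key; analysis happens elsewhere, later.
|         raw = urllib.request.urlopen(
|             "https://example.com/listings", timeout=30).read()
|         key = f"raw/{int(time.time())}.html"
|         s3.put_object(Bucket="my-scrape-archive", Key=key, Body=raw)
|         return {"stored": key}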
| FrustratedMonky wrote:
| "Data Science Project Failing After 1,600 Days"
|
| Sounds like my Thesis.
|
| How many people have spent 4+ years on a Thesis and then just
| completely given up: tired, drained, no interest in continuing?
| The bright-eyed, bushy-tailed wonder, all gone.
| dankwizard wrote:
| I don't speak the language so maybe what you're scraping isn't in
| this list, but why manual when they seem to have comprehensive
| RSS feeds? [1]
|
| Automating this part should have been day 1.
|
| [1] https://www.tagesschau.de/infoservices/rssfeeds
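|
| For illustration, a minimal sketch with feedparser (the feed URL
| below is a placeholder; pick a real one from the page above):
|
|     import feedparser
|
|     FEED_URL = "https://www.tagesschau.de/example-feed.xml"  # placeholder
|     feed = feedparser.parse(FEED_URL)
|     for entry in feed.entries:
|         # Field names follow common RSS conventions; availability
|         # can vary by feed.
|         print(entry.get("published", "?"), entry.get("link"))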
| smcin wrote:
| That's what I just concluded. I think the OP was oversold on
| the idea of using AI to do scraping, NLP and summarization, all
| in one go.
| j45 wrote:
| I don't know that projects ever fail.
|
| Doing them and learning and growing from them is the point.
|
| They shed light on your path, and also on what you are able to
| see as possible.
| ddxv wrote:
| Why not open source? I've been slaving away at some possibly
| pointless data scraping sites that collect app data and the SDKs
| that apps use. I figure if I at least open source it, the data
| and code are there for others to use.
| kqr wrote:
| I see some recommendations about running a small version of the
| analysis first to see if it's going to work at all. I agree, and
| the next level up is to also estimate the _value_ of performing
| the full analysis. I.e. not just whether or not it will work at
| all, but how much it is allowed to cost and still be useful.
|
| You may find, for example, that each unit of uncertainty reduced
| costs more than the value of the corresponding uncertainty
| reduction. This is the point at which one needs to either find a
| new approach, or be content with the level of uncertainty one
| has.
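|
| As a toy illustration with made-up numbers (none of these come
| from the article):
|
|     # Value-of-information check before committing to the full run.
|     cost_of_full_analysis = 200.0   # hours or dollars, your unit
|     uncertainty_reduced = 3.0       # units of uncertainty removed
|     value_per_unit = 50.0           # worth of one unit of reduction
|
|     if uncertainty_reduced * value_per_unit > cost_of_full_analysis:
|         print("worth running the full analysis")
|     else:
|         print("find a cheaper approach, or accept the uncertainty")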
| brikym wrote:
| I know the feeling. I managed 9 months scraping supermarket data
| before I gave up, mostly because a few other people were doing it
| and I was short on time.
| barrenko wrote:
| People relatively new to CS would be wise to be warned about what
| a colossal time sink it is.
| wodenokoto wrote:
| > Store raw data if possible. This allows you to condense it
| later.
|
| I have some daily scripts reading from an http endpoint, and I
| can't really decide what to do when it returns html instead of
| json. Should I store the HTML as it is "raw data" or should I
| just dismiss it? The API in question has a tendency to return 200
| with a webpage saying that the API can't be reached (typically
| because of a timeout).
| IanCal wrote:
| I wouldn't usually store that; I'd use it to trigger retries.
|
| For you, storing the raw data means storing the json that the
| http endpoint returns, rather than something like
|
|     content = get(url).json()
|     info_i_care_about = content['data']['title']
|     store(info_i_care_about)
|
| as otherwise you'll get stuck when the json response moves the
| title to data.metadata.title or whatever
|
| It's usually less of an issue with structured data; things like
| html change more often. But keeping that raw data means you can
| process it in various different ways later.
|
| You also decouple errors so your parsing error doesn't stop
| your write from happening.
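|
| A minimal sketch of that shape (names and the retry policy are
| just examples):
|
|     import json, time, requests
|
|     def fetch_json(url, attempts=3):
|         for i in range(attempts):
|             resp = requests.get(url, timeout=30)
|             try:
|                 return resp.json()
|             except ValueError:
|                 # HTML error page despite a 200: back off, retry.
|                 time.sleep(2 ** i)
|         raise RuntimeError("endpoint kept returning non-json")
|
|     content = fetch_json("https://example.com/api/items")  # placeholder
|     # Store the raw payload first; extract fields in a separate step.
|     with open(f"raw/{int(time.time())}.json", "w") as f:  # raw/ must exist
|         json.dump(content, f)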
| tessierashpool9 wrote:
| the last thing the world or rather germany needs is a news ticker
| based on ... the tagesschau LOL
| KeplerBoy wrote:
| Oh boy, the topic (Covid) alone would have left me exhausted
| after a few months. I'd heard enough of it by mid-2021.
| rybosworld wrote:
| > The data collection process involved a daily ritual of manually
| visiting the Tagesschau website to capture links to both the
| COVID and later Ukraine war newstickers. While this manual
| approach constituted the bulk of the project's effort, it was
| necessitated by Tagesschau's unstructured URL schema, which made
| automated link collection impractical.
|
| > The emphasis on preserving raw HTML proved vital when
| Tagesschau repeatedly altered their newsticker DOM structure
| throughout Q2 2020.
|
| Another big takeaway is that it's not sustainable to rely on this
| type of data source. Your data source should be stable. If the
| site offers APIs, that's almost always better than parsing HTML.
|
| Website developers do not consider scrapers when they make
| changes. Why would they? So if you are ever trying to collect
| some unique dataset, it doesn't hurt to reach out to the web devs
| to see if they can provide a public API.
| abirch wrote:
| Please consider it an early Christmas present to yourself if
| you can pay a nominal amount for an API instead of spending
| your time scraping unless you enjoy doing the scraping.
| Uptrenda wrote:
| I think whether you 'succeed' or 'fail' at a side project, it
| is still valuable. Even if you can't finish it, or it turns
| out differently to how you imagined, you come away as a
| better version of yourself: a person who is more optimized for a
| new strategy. And sometimes 'failure' is a worthwhile price for
| that ability. Who knows, it might be exactly what prepares you
| for something even bigger in the future.
| fuzzfactor wrote:
| I guess the kind of extreme effort that _doesn't usually have
| a promising conclusion_ is more common in scientific research,
| or experimentation in general, but sometimes you just have to
| get accustomed to it.
|
| Eventually it doesn't really make any difference if there's no
| breathtaking milestone because it turned out to be impossible
| by nature, ran out of runway, or lost interest after a more or
| less valiant attempt.
|
| What can be gained is the strength to overcome the near-
| impossible next time; all it has to do is be a certain degree
| less impossible, and you'll know whether it would take you over
| the goal line, like few others can, because you've been there.
|
| Without even worrying as much about whether you will lose
| interest or not, that's a lot less stress and pressure when you
| think about it.
|
| This can enable you more realistically to succeed in other
| areas where peers may find it impossible or not be able to do
| as well without as big an inconclusive project behind them.
| TheGoodBarn wrote:
| What I love about projects like this is they are dynamic enough
| to cover a number of interests all in one.
|
| I personally have some side projects that have started as X,
| transitioned into Y and Z, and then I stole some ideas and built
| A, which turned into B. Then a requirement in my professional
| job called for the Z solution mixed with the B solution, which
| resulted in something else that re-ignited my interest in X and
| helped me rebuild with a clearer mindset on what I intended in
| the first place.
|
| All that to say, these things are dynamic and a long list of
| "failed" projects is a historical narrative of learning and
| interests over time. I love to see it.
| sota_pop wrote:
| Nice article OP. I and a great many others suffer from the same
| struggles of bringing personal projects to "completion", and I've
| gotta respect the resilience in the length of time you hung in
| there. However, not to be overly pedantic, but I always felt
| "data science" was an exploratory exercise to discover insights
| into a given data set. I always personally filed the efforts to
| create the pipeline and associated automation (i.e. identify,
| capture, and store a given data set - more commonly referred to
| as "ETL") as a "data engineering" task, which these days is
| considered a different specialty. Perhaps if you scope your
| problem a little smaller, you may yet be able to capture
| something demonstrably valuable to others (and something you
| might consider "finished"). You'd be surprised how simple
| something that addresses a real issue can be to be able to
| provide real value for others.
|
| Nice work and great effort.
| sshrajesh wrote:
| Anyone know what software was used to create these diagrams:
| https://lellep.xyz/blog/images/failed_data_science_project/2...
| regular_trash wrote:
| Excalidraw
| tvrg wrote:
| Looks like something you could create with excalidraw. It's an
| awesome tool!
|
| https://excalidraw.com/
| dowager_dan99 wrote:
| I for one don't want to start counting everything I lose interest
| in as a "failure", that would be too depressing. I actually think
| this is a feature not a flaw. You have very few attention tokens
| and should be aggressive in getting them back.
|
| I think this is very different from the "finishing" decision.
| That should focus on scope and iterations, while attempting to
| account for effort vs. reward and avoiding things like sunk cost
| influences.
|
| Combine both and you've got "pragmatic grit": the ability to get
| valuable shit done.
___________________________________________________________________
(page generated 2024-12-09 23:01 UTC)