[HN Gopher] Scipy Lecture Notes
       ___________________________________________________________________
        
       Scipy Lecture Notes
        
       Author : vyuh
       Score  : 285 points
       Date   : 2021-01-08 06:24 UTC (16 hours ago)
        
 (HTM) web link (www.scipy-lectures.org)
 (TXT) w3m dump (www.scipy-lectures.org)
        
       | the_mango wrote:
       | Am I the only one who read - Spicy Lecture Notes ?
        
         | dragonshed wrote:
         | Definitely not the only one. Capitalization matters. For me,
         | the mental transposition is less likely with 'SciPy' than with
         | 'Scipy'
        
         | jagged-chisel wrote:
         | this is the first time I have ever dyslexified SciPy into Spicy
         | and I fear I will never read it correctly again.
        
         | tsjq wrote:
         | Me Too !
        
       | maztaim wrote:
       | I skip-read this as SPICY lecture notes...
        
       | adenozine wrote:
       | What an incredible resource!
       | 
       | It's always great to see well-crafted python resources. It's so
       | easy to get started in python and you can get pretty far without
       | knowing the best ways to do things, so I'm glad there's things
       | like this for newbies.
       | 
       | Maybe in the future, the statistics portion could be expanded.
       | While I'm grateful for all this information, it is rather odd to
       | leave out Bayesian stuff.
       | 
       | As an aside, HN comments with nothing to say except CSS comments
       | is so shameful. Imagine collecting all this information and
       | giving away this catalogue for free and having someone nitpick
       | some silly sidebar zoom functionality. It's honestly despicable
       | how often it happens. I hope the author knows how much this
       | resource helps people out.
        
         | beojan wrote:
         | > As an aside, HN comments with nothing to say except CSS
         | comments is so shameful.
         | 
         | I see two top-level comments of this sort, and they're both at
         | "I can't read it" severity.
        
       | zappo2938 wrote:
       | So .... where do I learn statistics in the first place? Let me
       | rephrase the question. What is the most efficient way to learn
       | the minimum viable amount of statistics?
        
         | cinntaile wrote:
         | An introductory statistics course. I'm sure there are a few of
         | those available online, both as a paid course and as free
         | online university lectures.
        
         | yellowstuff wrote:
         | Think Stats is a very good book aimed at Python programmers who
         | want a broad overview of practical statistical techniques:
         | https://greenteapress.com/thinkstats/html/index.html
        
           | zappo2938 wrote:
           | Thank you for the recommendation. I build admin dashboards
           | using stock (double entendre?) charting libraries and
           | recently have been using my own d3.js visualizations with
           | dynamic content. At this point, I might as well start to
           | delve into data science and have been investing time
           | developing math skills with calculus and linear algebra. I
           | would like to also take some time and learn basic statistics
           | concepts. I want to level up a little bit but don't see the
           | point in getting a Ph.D. in machine learning. I only need the
           | basics to start from.
        
             | joshvm wrote:
             | You probably want the second edition:
             | https://greenteapress.com/wp/think-stats-2e/
             | 
             | The first edition PDF 404s for me.
        
         | st1x7 wrote:
         | > What is the most efficient way to learn the minimum viable
         | amount of statistics?
         | 
         | You need to add 3 constraints to the question
         | 
         | 1. What is your starting point and current knowledge of
         | mathematics and statistics?
         | 
         | 2. Minimum viable for what? What do you need the statistics
         | knowledge for?
         | 
         | 3. How much effort can you afford to put into this over what
         | period of time?
         | 
         | Then the answer ranges from "here are a couple of good youtube
         | videos" to "here is how to design your own degree in statistics
         | using freely available material".
        
       | iagovar wrote:
       | In my journey through data analytics, what helped me most was
       | fighting with real datasets. Lectures are fine, but you don't
       | really grasp the little details needed to do a proper job until
       | you face messy datasets, very large datasets, text in a
       | non-English language, etc.
       | 
       | That's the most useful stuff in my opinion. Courses and lectures
       | include sample data that never forces you to optimize your
       | workflow because your box can't deal with it in a reasonable
       | time.
       | 
       | Or when you go crazy because you can't perform some analysis
       | because something somewhere is wrong and your debugger can't help
       | you, and you just want to punch someone in the face.
       | 
       | That's how I discovered that cleaning and preparing data is about
       | 90% of the job, that you should avoid CSV for non-numeric data and
       | use SQLite instead when possible, that Knime is a godsend, etc.
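A minimal sketch of the CSV-versus-SQLite point, using only Python's standard `sqlite3` module. The `posts` table, its columns, and the sample rows are hypothetical and not from the thread. The idea is that free text containing commas, quotes, and newlines goes in and comes out unchanged, with no quoting or delimiter rules to get wrong.

```python
import sqlite3

# In-memory database for illustration; pass a file path to persist it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, body TEXT)"
)

# Hypothetical scraped rows: commas, quotes, newlines, non-ASCII text.
rows = [
    ("ana", 'Text with, commas and "quotes"'),
    ("luis", "Línea uno\nlínea dos, con salto de línea"),
]

# Parameterized inserts keep the text intact without manual escaping.
conn.executemany("INSERT INTO posts (author, body) VALUES (?, ?)", rows)
conn.commit()

# Read the rows back in insertion order.
stored = conn.execute("SELECT author, body FROM posts ORDER BY id").fetchall()
for author, body in stored:
    print(author, repr(body))
```

The same file can later be queried with plain SQL, which also helps once the data no longer fits comfortably in memory.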
        
         | giu wrote:
         | By real datasets you mean company-specific ones? Or do you
         | happen to have some examples that are openly available which
         | helped you a lot?
         | 
         | I definitely concur with your first point, since I had the
         | same experience, specifically when working with
         | company-specific datasets.
         | 
         | From my experience one also underestimates how much time
         | cleaning up the data takes; there are quite a few steps you
         | need to go through before you can really start to analyze a
         | dataset.
        
           | iagovar wrote:
           | I happen to scrape a lot of large websites (mostly forums
           | currently) and that's messy enough to force you into learning
           | tricks.
           | 
           | I didn't stumble upon any (tabular, at least) dataset that
           | wasn't very curated.
           | 
           | Keep in mind that I studied sociology, so stuff that is a
           | given for most HN people isn't for me. I had to learn a lot
           | of CSS (for selectors), regex (still hate it), what OLAP is
           | and how to take advantage of it (DuckDB), and a lot of stuff
           | I'm not even aware of now.
           | 
           | But I remember taking courses at my uni, and later on, with R
           | and Python. It was interesting, but no matter how deep into
           | the rabbit hole of weird models I went, it felt... IDK,
           | shallow?
           | 
           | Imagine yourself pulling data out of a company ERP, with
           | human-filled data. It won't be a walk in the park where you
           | just make some logit models and call it a day. You'll spend a
           | lot of time trying to understand what's going on. And only
           | then do you fit the models or make a dashboard.
        
             | giu wrote:
             | Thanks a lot for your reply!
             | 
             | Scraping websites can be quite the messy business, since
             | some websites change their document structure more often
             | than others.
             | 
             | Nonetheless, it's still a very instructive activity and you
             | can build quite the pipeline around it (scraping multiple
             | websites, joining datasets, efficiently storing the data,
             | etc.).
        
               | iagovar wrote:
               | Yeah, when data piled up I had to think about how to
               | store it, RAM, and a bunch of other things that I didn't
               | have to consider with sample data. RAM specifically, and
               | how to transform data without needing so much of it, was
               | a concern for some time.
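One common way to transform data without holding it all in RAM is to stream it in fixed-size chunks and keep only a running aggregate. The sketch below assumes a line-oriented text source and a word-count task. Both are hypothetical stand-ins, as are the helper names `iter_chunks` and `count_words`.

```python
import io
from collections import Counter
from itertools import islice


def iter_chunks(lines, chunk_size):
    """Yield lists of at most chunk_size items from any iterable."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def count_words(lines, chunk_size=1000):
    """Count words while holding only one chunk of lines in memory."""
    totals = Counter()
    for chunk in iter_chunks(lines, chunk_size):
        for line in chunk:
            totals.update(line.lower().split())
    return totals


# A small in-memory stand-in for a large file opened with open(path).
sample = io.StringIO("spam eggs\nspam ham\neggs spam\n")
word_counts = count_words(sample, chunk_size=2)
print(word_counts.most_common())
```

Because a file object is itself a lazy iterator over lines, passing `open(path)` instead of the `StringIO` sample keeps memory use bounded by the chunk size plus the size of the aggregate.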
        
               | rohan_shah wrote:
               | I am also currently learning to scrape forums. And I am a
               | philosophy student. Could you point to some resources
               | that helped you learn it better?
        
               | iagovar wrote:
               | Are you looking for something specific? Most tools have
               | documentation you can bang your head against.
        
               | jmt_ wrote:
               | Learning CSS selectors and HTML structure, inspect
               | element and the other dev tools built into your browser,
               | and something like BeautifulSoup (for static/non-JS-heavy
               | pages) and Selenium (JS and other complicated pages) is
               | pretty key imo. My background in web dev helped me with
               | the HTML stuff. Basically, you fire up the page in a
               | browser, use inspect element to see how CSS selectors
               | can uniquely identify the data you want, then use
               | BeautifulSoup or Selenium to parse and interact with the
               | DOM. That will cover most web scraping cases.
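The workflow above (find a selector that identifies the data, then pull out the matching elements) can be sketched with the standard library alone. This example deliberately swaps BeautifulSoup for `html.parser` so that it runs without third-party packages. The HTML snippet, the `p.body` selector, and the `ClassTextExtractor` class are all hypothetical. With BeautifulSoup installed, the rough equivalent would be `soup.select("p.body")`.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text of every <tag class="css_class"> element."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self._depth = 0  # > 0 while inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        if self._depth > 0:
            # Track nested elements of the same tag so the end tags balance.
            if tag == self.tag:
                self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self._depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if self._depth > 0 and tag == self.tag:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth > 0:
            self.results[-1] += data


# Hypothetical forum markup, as you might see it via "inspect element".
html = """
<div class="post"><span class="author">ana</span>
  <p class="body">First <b>post</b></p></div>
<div class="post"><span class="author">luis</span>
  <p class="body">Second post</p></div>
"""

extractor = ClassTextExtractor("p", "body")
extractor.feed(html)
bodies = extractor.results
print(bodies)
```

A hand-rolled parser like this only handles a single tag-plus-class pattern. A real selector engine covers far more cases, which is the main reason to reach for BeautifulSoup or Selenium in practice.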
        
         | pmart123 wrote:
         | This is true, but this helps build data engineering/cleaning
         | skills, which is a different but complementary skill to
         | modeling.
        
       | freakynit wrote:
       | Wow!!!! This is gold. Super useful. Thank you for this :)
        
       | johndoe42377 wrote:
       | This, by the way, should not be called "science". Science is a
       | methodology of establishing aspects of truth (via reproducible
       | experiments).
       | 
       | What it should be called accurately is "modeling". Mostly
       | oversimplified and plainly wrong (like the Bayesian sect or any
       | kind of predictive modeling - look how all covid models and
       | simulations missed everything).
       | 
       | So, it is data modeling, not data science. And it is important to
       | realize and understand the difference.
        
       | enriquto wrote:
       | The section "how does python compare to other solutions" is a bit
       | lackluster, and heavily biased at the same time. It would be more
       | useful if this section was written by proponents of each of the
       | other "solutions".
        
         | reallydontask wrote:
         | I agree with you 100%, maybe offer to write it up for them.
         | 
         | It's hard to write from a different point of view from that
         | which you hold, or at least that's what I find
        
       | rajesht wrote:
       | If you read it as spicy, you are not alone. The human brain
       | optimizes by reading the first and last letters of the wodrs
        
       | asicsp wrote:
       | Here's more awesome resources:
       | 
       | * Pandas:
       | https://pandas.pydata.org/docs/getting_started/index.html
       | 
       | * DSP: https://greenteapress.com/thinkdsp/html/index.html
       | 
       | * Numpy: https://www.labri.fr/perso/nrougier/from-python-to-numpy/
       | 
       | * Data Carpentry: https://datacarpentry.org/lessons/
       | 
       | * Data science path: https://github.com/ossu/data-science
        
         | mattficke wrote:
         | This is a great list, thank you.
         | 
         | Julia Evans has a Pandas cookbook that's a good complement to
         | the official docs: https://github.com/jvns/pandas-cookbook
        
         | 3do wrote:
         | thank you!
        
         | screature2 wrote:
         | Also, I love the cookbooks that Chris Albon put together
         | https://chrisalbon.com/
         | 
         | I make especially heavy use of the data wrangling recipes
         | every time I can't figure out how to do something in pandas.
        
         | blewboarwastake wrote:
         | Thank you for sharing. Never heard of Data Carpentry, looks
         | awesome.
        
         | jbencook wrote:
         | This is great! Has anyone read From Python to NumPy?
        
       | complex_pi wrote:
       | Co-editor of the lecture notes here, if someone has a question.
        
       | Bostonian wrote:
       | Does anyone have a book they would recommend over this resource
       | for learning Scipy? Or is this the best place to start?
        
         | mattip wrote:
         | This is the best place to become familiar with the tools and to
         | set the stage for your journey. Then find a problem you want to
         | solve, and find more domain specific resources. Some people
         | learn best from tutorials, some from video, some from courses,
         | some from just banging their heads against a wall till they
         | figure it out.
        
       | pw6hv wrote:
       | On Firefox, the left table-of-contents pane overlaps the text,
       | so I cannot see the leftmost part of the paragraphs...
        
         | dguest wrote:
         | If you make the window narrower the content pane disappears, so
         | it's possible to work around this.
        
         | jsinai wrote:
         | Same issue with Safari.
         | 
         | Ignoring that small issue, this is a well-crafted and mature
         | resource, one I wish I had access to 5 years ago! Good job to
         | the authors.
        
         | pw6hv wrote:
         | I think it's fixed now, it was not working correctly before but
         | now the site looks great!
        
         | complex_pi wrote:
         | Can you report on your screen resolution?
        
       | ABeeSea wrote:
       | The sidebar covers the content when zooming. Terrible design for
       | accessibility.
        
         | rajamaka wrote:
         | Same here, but without zooming.
        
         | SiempreViernes wrote:
         | Works fine for me
        
       | sireat wrote:
       | This is an awesome resource but the general Python section could
       | use some work.
       | 
       | I am assuming that the target audience is scientists with a
       | modicum of programming knowledge.
       | 
       | The list and especially dictionary sections are a bit bare.
       | 
       | The optimization section could have a discussion of when to use
       | lists, dictionaries, tuples and sets (for example, the difference
       | between "needle" in my_list vs "needle" in my_set).
       | 
       | It could also cover when to use something from collections and
       | when to use ndarray (the short answer being: it depends).
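The list-versus-set membership difference mentioned above can be illustrated with a small timing sketch. The collection size and the item names are arbitrary assumptions. A list checks elements one by one, so a lookup near the end scans almost everything, while a set uses hashing and takes roughly constant time on average.

```python
import timeit

n = 100_000
my_list = [f"item{i}" for i in range(n)]
my_set = set(my_list)

# Worst case for the list: the needle is the last element.
needle = f"item{n - 1}"

in_list = needle in my_list  # linear scan over the list
in_set = needle in my_set    # hash lookup

# Time 100 lookups against each container.
list_time = timeit.timeit(lambda: needle in my_list, number=100)
set_time = timeit.timeit(lambda: needle in my_set, number=100)

print(f"list: {list_time:.6f} s, set: {set_time:.6f} s")
```

Exact timings depend on the machine, but the gap grows with the size of the collection. The trade-off is that a set does not preserve order or duplicates and requires hashable elements.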
        
       ___________________________________________________________________
       (page generated 2021-01-08 23:02 UTC)