[HN Gopher] Scipy Lecture Notes
___________________________________________________________________
Scipy Lecture Notes
Author : vyuh
Score : 285 points
Date : 2021-01-08 06:24 UTC (16 hours ago)
(HTM) web link (www.scipy-lectures.org)
(TXT) w3m dump (www.scipy-lectures.org)
| the_mango wrote:
| Am I the only one who read - Spicy Lecture Notes ?
| dragonshed wrote:
| Definitely not the only one. Capitalization matters. For me,
| the mental transposition is less likely with 'SciPy' than with
| 'Scipy'
| jagged-chisel wrote:
| this is the first time I have ever dyslexified SciPy into Spicy
| and I fear I will never read it correctly again.
| tsjq wrote:
| Me Too !
| maztaim wrote:
| I skip-read this as SPICY lecture notes...
| adenozine wrote:
| What an incredible resource!
|
| It's always great to see well-crafted python resources. It's so
| easy to get started in python and you can get pretty far without
| knowing the best ways to do things, so I'm glad there's things
| like this for newbies.
|
| Maybe in the future, the statistics portion could be expanded.
| While I'm grateful for all this information, it is rather odd to
| leave out Bayesian stuff.
|
| As an aside, HN comments with nothing to say except CSS comments
| is so shameful. Imagine collecting all this information and
| giving away this catalogue for free and having someone nitpick
| some silly sidebar zoom functionality. It's honestly despicable
| how often it happens. I hope the author knows how much this
| resource helps people out.
| beojan wrote:
| > As an aside, HN comments with nothing to say except CSS
| comments is so shameful.
|
| I see two top-level comments of this sort, and they're both at
| "I can't read it" severity.
| zappo2938 wrote:
| So .... where do I learn statistics in the first place? Let me
| rephrase the question. What is the most efficient way to learn
| the minimum viable amount of statistics?
| cinntaile wrote:
| An introductory statistics course. I'm sure there are a few of
| those available online, both as a paid course and as free
| online university lectures.
| yellowstuff wrote:
| Think Stats is a very good book aimed at Python programmers who
| want a broad overview of practical statistical techniques:
| https://greenteapress.com/thinkstats/html/index.html
| zappo2938 wrote:
| Thank you for the recommendation. I build admin dashboards
| using stock (double entendre?) charting libraries and
| recently have been using my own d3.js visualizations with
| dynamic content. At this point, I might as well start to
| delve into data science and have been investing time
| developing math skills with calculus and linear algebra. I
| would like to also take some time and learn basic statistics
| concepts. I want to level up a little bit but don't see the
| point of getting a Ph.D. in machine learning. I only need the
| basics to start from.
| joshvm wrote:
| You probably want the second edition:
| https://greenteapress.com/wp/think-stats-2e/
|
| The first edition PDF 404s for me.
| st1x7 wrote:
| > What is the most efficient way to learn the minimum viable
| amount of statistics?
|
| You need to add 3 constraints to the question
|
| 1. What is your starting point and current knowledge of
| mathematics and statistics?
|
| 2. Minimum viable for what? What do you need the statistics
| knowledge for?
|
| 3. How much effort can you afford to put into this over what
| period of time?
|
| Then the answer ranges from "here are a couple of good youtube
| videos" to "here is how to design your own degree in statistics
| using freely available material".
| iagovar wrote:
| In my journey through data analytics, what helped me most is to
| fight with real datasets. Lectures are fine, but you don't really
| grasp the little details needed to do a proper job until you have
| messy datasets, very large datasets, have to deal with text in a
| non-English language, etc.
|
| That's the most useful stuff in my opinion. Courses and lectures
| include sample data that never puts you in the position of
| having no option but to optimize your workflow because your box
| can't deal with it in a reasonable time.
|
| Or when you go crazy because you can't perform some analysis
| because something somewhere is wrong and your debugger can't help
| you, and you just want to punch someone in the face.
|
| That's how I discovered that cleaning and preparing data is about
| 90% of the job, that it pays to avoid CSV for non-numeric data and
| use SQLite instead when possible, what a godsend KNIME is, etc.
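|
| A minimal sketch of the CSV-vs-SQLite point, using only Python's
| standard sqlite3 module. The table name, columns and sample posts
| are made up for illustration; the idea is that free text with
| commas, quotes and newlines is stored and read back unchanged.

```python
import sqlite3

# Hypothetical scraped forum posts; the text contains commas, quotes and
# newlines, which are exactly the things that make CSV round-trips fragile.
posts = [
    (1, "ana", 'She said "hola, mundo"\nand left.'),
    (2, "li", "Precio: 1,5 EUR; envío gratis"),
]

conn = sqlite3.connect(":memory:")  # pass a file path instead for real data
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, body TEXT)")
# Parameter placeholders (?) let sqlite3 handle all quoting for us.
conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", posts)
conn.commit()

# The text comes back exactly as stored, with no delimiter rules to get wrong.
rows = conn.execute("SELECT id, author, body FROM posts ORDER BY id").fetchall()

# Filtering happens inside the database instead of in a parsing loop.
authors_with_comma = [
    author
    for (author,) in conn.execute(
        "SELECT author FROM posts WHERE body LIKE '%,%' ORDER BY id"
    )
]
```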
| giu wrote:
| By real datasets you mean company-specific ones? Or do you
| happen to have some examples that are openly available which
| helped you a lot?
|
| I definitely concur with your first point, since I had the same
| experience, specifically when working with company-specific
| datasets.
|
| From my experience one also underestimates how much time
| cleaning up the data takes; there are quite a few steps you
| need to go through before you can really start to analyze a
| dataset.
| iagovar wrote:
| I happen to scrape a lot of large websites (mostly forums
| currently) and that's messy enough to force you into learning
| tricks.
|
| I didn't stumble upon any (tabular, at least) dataset that
| wasn't very curated.
|
| Keep in mind that I studied sociology, so stuff that is a
| given for most HN people isn't for me. I had to learn a lot
| of CSS (for selectors), regex (still hate it), what's OLAP
| and how to take advantage of it (DuckDB) and a lot of stuff
| I'm not even aware of now.
|
| But I remember taking courses in my Uni, and later on, with R
| and Python. It was interesting, but no matter how deep into
| the rabbit hole of weird models I went, it felt... IDK,
| shallow?
|
| Imagine yourself pulling data out of a company ERP, with
| human-filled data. It won't be a walk in the park where you just
| fit some logit models and call it a day. You'll spend a lot of
| time trying to understand what's going on. And only then do you
| fit the models or make a dashboard.
| giu wrote:
| Thanks a lot for your reply!
|
| Scraping websites can be quite the messy business, since
| some websites change their document structure more often
| than others.
|
| Nonetheless, it's still a very instructive activity and you
| can build quite the pipeline around it (scraping multiple
| websites, joining datasets, efficiently storing the data,
| etc.).
| iagovar wrote:
| Yeah, when data piled up I had to think about how to
| store it, RAM, and a bunch of other things that I didn't
| have to consider with sample data. RAM specifically, and how
| to transform data without needing so much of it, was a
| concern for some time.
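|
| One common way to transform data without holding it all in RAM is
| to stream it row by row. A rough sketch with the standard csv
| module; the function name, column names and sample data are
| hypothetical, and a real file object would replace the StringIO.

```python
import csv
import io
from collections import Counter


def count_by_column(lines, column):
    """Count the values of one column, keeping only one row in memory at a time."""
    counts = Counter()
    # DictReader pulls rows lazily from the underlying iterable,
    # so the whole file is never loaded at once.
    for row in csv.DictReader(lines):
        counts[row[column]] += 1
    return counts


# Stand-in for a large file. In practice pass an open file object,
# e.g. open("posts.csv", newline="", encoding="utf-8") (hypothetical path).
sample = io.StringIO("author,words\nana,120\nli,45\nana,300\n")
counts = count_by_column(sample, "author")
```

| Memory use here grows with the number of distinct values in the
| column, not with the number of rows.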
| rohan_shah wrote:
| I am also currently learning to scrape forums. And I am a
| philosophy student. Could you point to some resources
| that helped you learn it better?
| iagovar wrote:
| Are you looking for something specific? Most tools have
| documentation you can bang your head against.
| jmt_ wrote:
| Learning CSS selectors and HTML structure, inspect
| element and the other dev tools built into your browser,
| and something like BeautifulSoup (for static/non-JS heavy
| pages) and Selenium (JS and other complicated pages) is
| pretty key imo. My background in web dev helped me with
| the HTML stuff. Basically, you fire up the page in a
| browser, inspect element to see how you can use CSS
| selectors to uniquely identify that data, then using
| BeautifulSoup or Selenium to parse and interact with the
| DOM will cover most web scraping cases.
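|
| A rough sketch of the "identify the data by its selector" step,
| written with the standard-library html.parser so it runs without
| extra installs. This is a stand-in, not BeautifulSoup itself; the
| class name ClassTextExtractor and the sample HTML are made up.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text of every <tag> element that carries a given CSS class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.depth = 0      # > 0 while we are inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            # Track nested elements of the same tag so we close at the right one.
            if tag == self.tag:
                self.depth += 1
        elif tag == self.tag and self.css_class in classes:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data


# Hypothetical static page, as you might see it in the browser's inspector.
html = """
<div class="post"><span class="author">ana</span><p class="body">First <b>post</b>!</p></div>
<div class="post"><span class="author">li</span><p class="body">Second post.</p></div>
"""

extractor = ClassTextExtractor("p", "body")
extractor.feed(html)
bodies = extractor.results

# With BeautifulSoup installed, the same query is usually written as a CSS
# selector: BeautifulSoup(html, "html.parser").select("p.body")
```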
| pmart123 wrote:
| This is true, but this helps build data engineering/cleaning
| skills, which is a different but complementary skill to
| modeling.
| freakynit wrote:
| Wow!!!! This is gold. Super useful. Thank you for this :)
| johndoe42377 wrote:
| This, by the way, should not be called "science". Science is a
| methodology of establishing aspects of truth (via reproducible
| experiments).
|
| What it should be called accurately is "modeling". Mostly
| oversimplified and plainly wrong (like the Bayesian sect or any
| kind of predictive modeling - look how all covid models and
| simulations missed everything).
|
| So, it is data modeling, not data science. And it is important to
| realize and understand the difference.
| enriquto wrote:
| The section "how does python compare to other solutions" is a bit
| lackluster, and heavily biased at the same time. It would be more
| useful if this section was written by proponents of each of the
| other "solutions".
| reallydontask wrote:
| I agree with you 100%, maybe offer to write it up for them.
|
| It's hard to write from a different point of view from that
| which you hold, or at least that's what I find
| rajesht wrote:
| If you read it spicy, you are not alone. Human brain optimizes by
| reading first and last letters to the wodrs
| asicsp wrote:
| Here's more awesome resources:
|
| * Pandas:
| https://pandas.pydata.org/docs/getting_started/index.html
|
| * DSP: https://greenteapress.com/thinkdsp/html/index.html
|
| * Numpy: https://www.labri.fr/perso/nrougier/from-python-to-numpy/
|
| * Data Carpentry: https://datacarpentry.org/lessons/
|
| * Data science path: https://github.com/ossu/data-science
| mattficke wrote:
| This is a great list, thank you.
|
| Julia Evans has a Pandas cookbook that's a good complement to
| the official docs: https://github.com/jvns/pandas-cookbook
| 3do wrote:
| thank you!
| screature2 wrote:
| Also, I love the cookbooks that Chris Albon put together
| https://chrisalbon.com/
|
| I make especially heavy use of the data wrangling recipes
| every time I can't figure out how to do something in pandas.
| blewboarwastake wrote:
| Thank you for sharing. Never heard of Data Carpentry, looks
| awesome.
| jbencook wrote:
| This is great! Has anyone read From Python to NumPy?
| complex_pi wrote:
| Co-editor of the lecture notes here, if someone has a question.
| Bostonian wrote:
| Does anyone have a book they would recommend over this resource
| for learning Scipy? Or is this the best place to start?
| mattip wrote:
| This is the best place to become familiar with the tools and to
| set the stage for your journey. Then find a problem you want to
| solve, and find more domain specific resources. Some people
| learn best from tutorials, some from video, some from courses,
| some from just banging their heads against a wall till they
| figure it out.
| pw6hv wrote:
| On Firefox, the left table of contents pane overlaps the text,
| so I cannot see the leftmost part of the paragraphs...
| dguest wrote:
| If you make the window narrower the content pane disappears, so
| it's possible to work around this.
| jsinai wrote:
| Same issue with Safari.
|
| Ignoring that small issue, this is a well-crafted and mature
| resource, one I wish I had access to 5 years ago! Good job to
| the authors.
| pw6hv wrote:
| I think it's fixed now, it was not working correctly before but
| now the site looks great!
| complex_pi wrote:
| Can you report on your screen resolution?
| ABeeSea wrote:
| The sidebar covers the content when zooming. Terrible design for
| accessibility.
| rajamaka wrote:
| Same here, but without zooming.
| SiempreViernes wrote:
| Works fine for me
| sireat wrote:
| This is an awesome resource but the general Python section could
| use some work.
|
| I am assuming that the target audience is scientists with a
| modicum of programming knowledge.
|
| The list and especially dictionary section is a bit bare.
|
| The optimization section could discuss when to use lists,
| dictionaries, tuples and sets (for example, the difference
| between "needle" in my_list vs "needle" in my_set).
|
| It could also cover when to use something from collections and
| when to use ndarray (the short answer being: it depends).
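|
| To illustrate the list vs set membership difference, here is a
| small timing sketch. The sizes and repeat count are arbitrary
| assumptions, and absolute timings will vary by machine; the point
| is only the relative gap.

```python
import timeit

n = 100_000
my_list = list(range(n))
my_set = set(my_list)
needle = n - 1  # worst case for the list: the last element

# `in` on a list scans element by element (O(n));
# `in` on a set is a hash lookup (O(1) on average).
list_time = timeit.timeit(lambda: needle in my_list, number=200)
set_time = timeit.timeit(lambda: needle in my_set, number=200)

print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```

| Building the set costs one pass over the data, so it pays off
| when the same collection is searched many times.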
___________________________________________________________________
(page generated 2021-01-08 23:02 UTC)