[HN Gopher] SQL for data scientists in 100 queries
       ___________________________________________________________________
        
       SQL for data scientists in 100 queries
        
       Author : Anon84
       Score  : 93 points
       Date   : 2024-02-06 21:50 UTC (1 hours ago)
        
 (HTM) web link (gvwilson.github.io)
 (TXT) w3m dump (gvwilson.github.io)
        
       | hobs wrote:
       | Don't forget the venerable SQLZoo - I have referred a zillion
       | people to it over the years.
       | https://www.sqlzoo.net/wiki/SQL_Tutorial
       | 
       | Edit: also an inaccuracy that's minor but can bite you if you're
       | not careful - they mention temporary tables are in memory not on
       | disk - that's not true in almost all sql databases, they are just
       | connection specific eg they don't persist after you disconnect.
       | 
       | Some databases are optimized for a temp table to be a throwaway,
       | but that can be a good or bad thing depending on the use case.
        
         | vavooom wrote:
         | I am also a fan of Mode's SQL Tutorial: https://mode.com/sql-
         | tutorial
        
       | vavooom wrote:
       | Learning outcomes:
       | 
       | * Explain the difference between a database and a database
       | manager.
       | 
       | * Write SQL to select, filter, sort, group, and aggregate data.
       | 
       | * Define tables and insert, update, and delete records.
       | 
       | * Describe different types of join and write queries that use
       | them to combine data.
       | 
       | * Use windowing functions to operate on adjacent rows.
       | 
       | * Explain what transactions are and write queries that roll back
       | when constraints are violated.
       | 
       | * Explain what triggers are and write SQL to create them.
       | 
       | * Manipulate JSON data using SQL.
       | 
       | * Interact with a database using Python directly, from a Jupyter
       | notebook, and via an ORM.
        
       | petalmind wrote:
       | > left outer join
       | 
       | > A join that is guaranteed to keep all rows from the first
       | (left) table. Columns from the right table are filled with actual
       | values if available or with null otherwise.
       | 
       | This wording only works for identity equality join condition. It
       | creates misleading mental model of left joins, and unfortunately
       | is very common.
        
         | erehweb wrote:
         | Can you elaborate with a gotcha example?
        
           | hobs wrote:
           | I assume they mean that row multiplication can occur but
           | otherwise not sure.
        
         | nomilk wrote:
         | I'm not sure I understand, I think this definition still works
         | for left outer joins on conditions other than identity
         | equality, since joins on, say, inequalities or multiple
         | conditions would still be "guaranteed to keep all rows from the
         | first (left) table. Columns from the right table are filled
         | with....".
        
       | shubhamjain wrote:
       | Shameless Plug: If anyone here wants to practice their SQL, they
       | are welcome to try my Mac app: TextQuery [1]. I built it because
       | I wanted to quickly import CSV datasets and run SQL queries on
       | them. I don't think there could be a more fun way to learn SQL
       | than to jump in and start analyzing thousands of public datasets.
       | Sure, you can use CLI/Code as well, but GUI is often faster and
       | easier. Currently, the app is in the beta period and free-to-use.
       | When launched, you'll get to keep the latest beta version.
       | 
       | [1]: https://textquery.app/
        
       | atseajournal wrote:
       | Glad to have learned about iif() from this!
        
         | webdoodle wrote:
         | It's really handy in conjunction with GROUP BY.
        
       | carabiner wrote:
       | Also StrataScratch.com for leetcode for SQL.
       | 
       | What really distinguishes an SQL master is working with queries
       | hundreds of lines long, and query optimization. For example, you
       | can often make queries faster by taking out joins and replacing
       | them with window functions. It's hard to practice these
       | techniques outside of a legit enterprise dataset with billions of
       | rows (maybe a good startup idea).
        
       | pama wrote:
       | I've found that chatGPT is excellent in helping with generating
       | and testing SQL queries.
        
       | nomilk wrote:
       | The tutorial can be downloaded here:
       | https://github.com/gvwilson/sql-tutorial/raw/main/sql-tutori...
       | 
       | (gives the penguins.db file necessary for the examples)
        
       | dinkleberg wrote:
       | Thanks for sharing this! Learn by example resources can be super
       | helpful.
        
       | nomilk wrote:
       | Before flights with patchy/no wifi, I often download a long,
       | single-page tutorial. This is perfect. Curious if anyone knows of
       | any for other languages/tech (e.g. beyond SQL).
        
       | SilverBirch wrote:
       | The queries are solid, but I really appreciate throwing in the
       | worlds most confusing diagrams here and there. It keeps me alert
       | trying to find where to even start with them.
        
       | QAFred123 wrote:
       | Testing
        
       ___________________________________________________________________
       (page generated 2024-02-06 23:00 UTC)