datapythonista blog (https://datapythonista.me/blog/)

pandas 2.0 and the Arrow revolution (part I)
Marc Garcia | Fri 17 February 2023
Introduction: At the time of writing this post, we are in the process of releasing pandas 2.0. The project has a large number of users, and it is widely used in production by personal and corporate users. This large user base forces us to be...

pandas with hundreds of millions of rows
Marc Garcia | Thu 22 September 2022
The problem: We want to find out which are the top 5 American airports with the largest average (mean) delay on domestic flights. Data: We will be using the Data Expo 2009: Airline on time data dataset from the Harvard Dataverse. The data consists...

Successful delivery of data projects
Marc Garcia | Sun 01 December 2019
This week we organized a round table with people involved in the management of data projects, mostly data science. The idea came after the Executives at PyData session organized earlier this year, and discussions with a few people on the challenges...

An update on the pandas documentation
Marc Garcia | Thu 28 November 2019
Some context: This post is mainly a technical post on the status of the pandas documentation. But let me provide a bit of context on where this comes from. It's a personal opinion, but I think pandas is one of the clearest examples of how...

New pandas workflow
Marc Garcia | Sun 17 November 2019
Some exciting news. After some years of organizing sprints and maintaining open source, I've been thinking about a more efficient workflow for projects with a high volume of activity, like pandas. An exaggerated example would be that I want to create...

Dataframe summit @ EuroSciPy write up
Marc Garcia | Wed 11 September 2019
Last week, EuroSciPy 2019 took place in Bilbao, Spain. This year we introduced the maintainers track, a room dedicated to discussions among maintainers.
The idea is similar to the birds of a feather or unconference sessions of other conferences....

pandas: The two cultures
Marc | Mon 22 July 2019
Leo Breiman was a distinguished statistician at UC Berkeley, known among other things for his major contributions to CART (decision trees) and ensemble techniques, mainly bootstrap aggregation. Combining both, he was able to define one of the...

Setting up Fedora
Marc Garcia | Wed 05 December 2018
Today I got my new Dell XPS (with Ubuntu preinstalled), and this is the procedure to set it up and get my perfect working environment. This is expected to be useful mainly for my future self, but I'm sharing it here in case someone else can find...

Useful git commands
Marc Garcia | Thu 08 November 2018
While git is surely one of my favorite tools, and increases my productivity in a sometimes unbelievable way (like when working on 3 or 5 features at the same time), sometimes there are operations that can be a bit tricky. There are plenty of git...

Blog moved
Marc Garcia | Sat 08 September 2018
It's been a while since I wanted to move my blog out of Blogger. Today I finally did it. :)
print('hello world (from Pelican)')
This new blog uses Pelican, and is hosted on GitHub Pages, which will let me create blog posts by simply using...