Workshop: Introduction to Siuba (PyData NYC 2023)
siuba is a Python data analysis library that aims to make everyday data science faster. It provides a small, consistent set of functions for working with messy, real-world data.
In this workshop, you’ll explore data on the top music tracks on Spotify. We’ll cover the five key siuba functions, which let you answer many questions about how popular artists are in different countries and about the features of their music.
Throughout the workshop, we’ll use plotnine to graph the data.
Who the heck is teaching?
Outline
- (set up): using GitHub Codespaces
- data wrangling: filter, arrange, mutate
- visualization: plotnine basics
- grouping and summarizing data
- additional plot types
Requirements
You should have some familiarity with Python. Some experience with pandas is useful, but not necessary!
Preparation
We’ll walk you through these steps:
Using this site
The content for this workshop is broken into about 12 lessons. Each lesson contains slides, followed by exercises.
You can also view all the slides together.