from setup import ___
from siuba import *
from plotnine import *
from music_top200 import music_top200, track_features
Course welcome
Exercise 1: inspecting music data
Try changing “Mexico” to “United States”. The code should then return only the top 200 hits from the United States.
(music_top200
  >> filter(_.country == "Mexico")
)
index | country | position | track_name | artist | streams | duration | continent |
---|---|---|---|---|---|---|---|
8200 | Mexico | 1 | Safaera | Bad Bunny | 7948565 | 295.177 | Americas |
8201 | Mexico | 2 | Si Veo a Tu Mamá | Bad Bunny | 7535381 | 170.972 | Americas |
8202 | Mexico | 3 | La Difícil | Bad Bunny | 5459673 | 163.084 | Americas |
... | ... | ... | ... | ... | ... | ... | ... |
8397 | Mexico | 198 | Vete Ya | Valentín Elizalde | 559399 | 157.760 | Americas |
8398 | Mexico | 199 | El Mundo a Tus Pies | Grupo Firme | 559065 | 176.227 | Americas |
8399 | Mexico | 200 | Verte Ir | DJ Luian | 558925 | 267.500 | Americas |
200 rows × 7 columns
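To make the filtering step more concrete, here is a minimal plain-Python sketch of the kind of row selection that `filter` performs: keep the rows where a column equals a value, drop the rest. The rows are a small subset copied from the Mexico output above. The list-of-dicts layout and the `keep_rows` helper are assumptions made for illustration only; they are not part of siuba, which works on pandas DataFrames.

```python
# A few rows copied from the Mexico output above (subset, for illustration).
rows = [
    {"country": "Mexico", "position": 1, "track_name": "Safaera", "artist": "Bad Bunny"},
    {"country": "Mexico", "position": 2, "track_name": "Si Veo a Tu Mamá", "artist": "Bad Bunny"},
    {"country": "Mexico", "position": 198, "track_name": "Vete Ya", "artist": "Valentín Elizalde"},
    {"country": "Mexico", "position": 199, "track_name": "El Mundo a Tus Pies", "artist": "Grupo Firme"},
]


def keep_rows(rows, column, value):
    """Hypothetical helper: return only the rows whose `column` equals `value`."""
    return [row for row in rows if row[column] == value]


# Same idea as a filter on one column: the condition is checked row by row.
bad_bunny = keep_rows(rows, "artist", "Bad Bunny")
print([row["track_name"] for row in bad_bunny])
```

The condition is evaluated once per row, so the result always has the same columns as the input and at most as many rows.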
Which artist has a track in the second position on the United States charts?
Roddy Ricch
Try again. That artist is in the first position.
Lil Uzi Vert
That’s right!
Halsey
Try again. That artist is the second from last position (198).
Exercise 2: inspecting track_features data
Use the options below to examine tracks by different artists. Can you find the options that order tracks from highest energy to lowest?
# The Weeknd
# Bad Bunny
# Roddy Ricch
# Justin Bieber
#_.popularity
#-_.popularity
#_.energy
#-_.energy
(track_features
  >> arrange(-_.popularity)
  >> filter(_.artist == "The Weeknd")
)
index | artist | album | track_name | energy | valence | danceability | speechiness | acousticness | popularity | duration |
---|---|---|---|---|---|---|---|---|---|---|
24982 | The Weeknd | After Hours | In Your Eyes | 0.719 | 0.7170 | 0.667 | 0.0346 | 0.00285 | 91 | 237.520 |
9283 | The Weeknd | After Hours | After Hours | 0.572 | 0.1430 | 0.664 | 0.0305 | 0.08110 | 84 | 361.027 |
24688 | The Weeknd | Starboy | Starboy | 0.587 | 0.4860 | 0.679 | 0.2760 | 0.14100 | 84 | 230.453 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
17383 | The Weeknd | Heartless | Heartless | 0.750 | 0.1980 | 0.531 | 0.1110 | 0.00632 | 60 | 200.080 |
3888 | The Weeknd | After Hours (Deluxe) | Nothing Compares - Bonus Track | 0.577 | 0.0398 | 0.524 | 0.0358 | 0.00253 | 49 | 222.307 |
22430 | The Weeknd | After Hours (Deluxe) | Missed You - Bonus Track | 0.364 | 0.4480 | 0.716 | 0.0866 | 0.10700 | 48 | 144.540 |
23 rows × 10 columns
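The minus sign in `arrange(-_.popularity)` is what puts the most popular track first. As a rough plain-Python sketch of that idea (not siuba's actual implementation), sorting on a negated numeric key reverses the order. The track names and popularity values are copied from the output above; the list-of-dicts layout is an assumption made for illustration.

```python
# Popularity values copied from the output above, deliberately out of order.
tracks = [
    {"track_name": "Heartless", "popularity": 60},
    {"track_name": "In Your Eyes", "popularity": 91},
    {"track_name": "Missed You - Bonus Track", "popularity": 48},
    {"track_name": "Starboy", "popularity": 84},
]

# Sorting on the value itself gives lowest to highest.
ascending = sorted(tracks, key=lambda t: t["popularity"])

# Negating the key flips the order: highest to lowest.
descending = sorted(tracks, key=lambda t: -t["popularity"])

print([t["track_name"] for t in descending])
```

The same trick applies to any numeric column, which is why the options above come in pairs with and without a leading minus sign.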
Exercise 3: Plotting track features
Here is one kind of plot you will learn to make in the course.
# - popularity
# - acousticness
# - danceability
(track_features
  >> filter(_.artist == "The Weeknd")
  >> ggplot(aes("energy", "valence", size = "popularity", color = "album", label = "track_name"))
  + geom_point()
  + geom_text(nudge_y = .05, size = 10)
)