The group_by verb

The summarize verb

(music_top200
  >> filter(_.country == "Japan")
  >> summarize(avg_duration = _.duration.mean()))
   avg_duration
0     250.53499

1 rows × 1 columns

Summarizing by country

(music_top200
  >> group_by(_.country)
  >> summarize(avg_duration = _.duration.mean())
)
           country  avg_duration
0        Argentina    212.847855
1        Australia    204.795300
2          Austria    184.894870
...            ...           ...
59   United States    190.827500
60         Uruguay    210.796985
61        Viet Nam    217.222830

62 rows × 2 columns
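
As a rough illustration of what the group_by and summarize pipeline above does conceptually (split the rows by key, apply an aggregation to each group, combine the results), here is a minimal plain-Python sketch. It is not how siuba is implemented. The toy rows and the helper name `group_and_summarize` are hypothetical, and the durations are made up rather than taken from music_top200.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy rows standing in for music_top200 (values are made up).
rows = [
    {"country": "Japan", "duration": 240.0},
    {"country": "Japan", "duration": 260.0},
    {"country": "Austria", "duration": 180.0},
    {"country": "Austria", "duration": 190.0},
    {"country": "Austria", "duration": 200.0},
]

def group_and_summarize(rows, key, column, func):
    """Split rows by `key`, apply `func` to each group's `column`, combine."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])  # split step
    # apply + combine steps: one output row per group, sorted by key
    return [
        {key: group_key, f"{func.__name__}_{column}": func(values)}
        for group_key, values in sorted(groups.items())
    ]

result = group_and_summarize(rows, "country", "duration", mean)
for row in result:
    print(row)
```

Each distinct country yields exactly one output row, which is why the grouped summary above has one row per country.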

Summarizing by continent and position

(music_top200
  >> group_by(_.continent, _.position)
  >> summarize(
      min_streams = _.streams.min(),
      max_streams = _.streams.max()
  )
)
     continent  position  min_streams  max_streams
0       Africa         1        94422        94422
1       Africa         2        74689        74689
2       Africa         3        67552        67552
...        ...       ...          ...          ...
997    Oceania       198        44570       225951
998    Oceania       199        44364       225492
999    Oceania       200        44291       225179

1000 rows × 4 columns

Summarizing without grouping

(music_top200
  >> summarize(
      min_streams = _.streams.min(),
      max_streams = _.streams.max()
  )
)
   min_streams  max_streams
0         1470     12987027

1 rows × 2 columns

Summarizing a single continent and position

(music_top200
  >> filter(_.continent == "Oceania", _.position == 1)
  >> summarize(
      min_streams = _.streams.min(),
      max_streams = _.streams.max()
  )
)
   min_streams  max_streams
0       321272      1757343

1 rows × 2 columns
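
The filter-then-summarize result for one continent and position should correspond to a single row of the grouped summary. The sketch below checks that relationship in plain pandas rather than siuba, so the calls differ from the pipelines on the slides. The `toy` DataFrame is hypothetical and its stream counts are made up.

```python
import pandas as pd

# Hypothetical toy data standing in for music_top200 (values are made up).
toy = pd.DataFrame({
    "continent": ["Oceania", "Oceania", "Oceania", "Africa"],
    "position": [1, 1, 2, 1],
    "streams": [300, 500, 200, 90],
})

# Grouped summary: one row per (continent, position) pair.
grouped = (
    toy.groupby(["continent", "position"], as_index=False)
       .agg(min_streams=("streams", "min"), max_streams=("streams", "max"))
)

# Filter-then-summarize for a single (continent, position) pair.
subset = toy[(toy["continent"] == "Oceania") & (toy["position"] == 1)]
single = {
    "min_streams": int(subset["streams"].min()),
    "max_streams": int(subset["streams"].max()),
}

# The matching row of the grouped result carries the same two values.
match = grouped[
    (grouped["continent"] == "Oceania") & (grouped["position"] == 1)
].iloc[0]

print(grouped)
print(single)
```

Grouping therefore saves writing one filter per combination: the grouped call computes every pair's summary in a single pass.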

Let’s practice!