
Function reference

Core verbs - one table

arrange Sort rows based on one or more columns.
count Count observations by group.
distinct Keep only distinct (unique) rows.
filter Keep rows that match condition.
head Keep the first n rows of data.
mutate, transmute Create or replace columns.
rename Rename columns.
select Keep, drop, or rename specific columns.
summarize Calculate a single number per grouping.
group_by, ungroup Specify groups for splitting rows of data.
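
The difference between count and distinct is easy to blur, so here is a minimal plain-Python sketch of the two behaviors. It does not use the library itself: `count_by`, `distinct_by`, and the sample rows are hypothetical names invented for illustration, and the real verbs operate on DataFrames rather than lists of dicts.

```python
from collections import Counter

# Hypothetical sample rows; each dict is one observation.
rows = [
    {"cyl": 4, "gear": 4},
    {"cyl": 4, "gear": 4},
    {"cyl": 6, "gear": 3},
    {"cyl": 4, "gear": 5},
]

def count_by(rows, *cols):
    """Count observations per group: one output row per group, plus a count n."""
    tally = Counter(tuple(r[c] for c in cols) for r in rows)
    return [dict(zip(cols, key), n=n) for key, n in tally.items()]

def distinct_by(rows, *cols):
    """Keep one row per unique combination of cols, in first-seen order."""
    seen = dict.fromkeys(tuple(r[c] for c in cols) for r in rows)
    return [dict(zip(cols, key)) for key in seen]

counts = count_by(rows, "cyl", "gear")
uniques = distinct_by(rows, "cyl", "gear")
```

Both results have one row per group; only the counting version carries the extra `n` column.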

Core verbs - two table

inner_join, left_join, right_join, full_join Mutating joins
semi_join, anti_join Filtering joins
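
To illustrate the split between mutating and filtering joins, the following is a rough plain-Python sketch of the semantics only. The helper functions and sample tables are assumptions made up for this example, not the library's implementation, and they join on a single key column.

```python
# Hypothetical sample tables, each a list of row dicts.
left = [{"id": 1, "x": "a"}, {"id": 2, "x": "b"}, {"id": 3, "x": "c"}]
right = [{"id": 1, "y": 10}, {"id": 3, "y": 30}, {"id": 4, "y": 40}]

def inner_join(left, right, key):
    """Mutating join: matched rows gain the right table's columns."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

def semi_join(left, right, key):
    """Filtering join: keep left rows that have a match; add no columns."""
    keys = {r[key] for r in right}
    return [l for l in left if l[key] in keys]

def anti_join(left, right, key):
    """Filtering join: keep left rows that have no match."""
    keys = {r[key] for r in right}
    return [l for l in left if l[key] not in keys]

joined = inner_join(left, right, "id")
matched = semi_join(left, right, "id")
unmatched = anti_join(left, right, "id")
```

A mutating join changes the columns of the result, while a filtering join only changes which rows of the left table survive.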

Query verbs

collect Retrieve data into a DataFrame.
show_query Print the query being generated.

Tidy verbs

complete Add rows for missing combinations in the data.
extract Add new columns by matching a pattern on a column of strings.
gather, spread Gather columns into long format. Spread out to wide format.
pivot_longer, pivot_wider Change columns of data to rows, or rows to columns. More comprehensive than gather and spread.
separate, unite Add new columns by splitting a character column, or combine several columns into one.
nest, unnest Create a column where each entry is a DataFrame.
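
As a rough picture of what pivoting longer and wider means, here is a plain-Python sketch under simplifying assumptions: a single id column, and every other column pivoted. The functions `to_long` and `to_wide` are hypothetical, and the parameter names `names_to`, `values_to`, `names_from`, and `values_from` are chosen for illustration and are not confirmed by this page.

```python
# Hypothetical wide table: one row per id, one column per measurement.
wide = [
    {"id": "a", "x": 1, "y": 2},
    {"id": "b", "x": 3, "y": 4},
]

def to_long(rows, id_col, names_to="name", values_to="value"):
    """Turn every non-id column into its own row (wide to long)."""
    return [
        {id_col: r[id_col], names_to: k, values_to: v}
        for r in rows
        for k, v in r.items()
        if k != id_col
    ]

def to_wide(rows, id_col, names_from="name", values_from="value"):
    """Collect name/value rows back into one row per id (long to wide)."""
    out = {}
    for r in rows:
        out.setdefault(r[id_col], {id_col: r[id_col]})[r[names_from]] = r[values_from]
    return list(out.values())

long_rows = to_long(wide, "id")
round_trip = to_wide(long_rows, "id")
```

In this sketch the two operations are inverses: pivoting longer and then wider returns the original table.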

Column Operations

Forcats

fct_collapse Rename categories. Optionally group all others.
fct_infreq Order categories by frequency (largest first).
fct_inorder Order categories by when they first appear.
fct_lump Lump infrequently observed categories together.
fct_recode Rename categories.
fct_reorder Reorder categories, using a calculation over another column.
fct_rev Reverse category levels.

Datetime

floor_date, ceil_date Round datetimes down or up to a specific granularity (e.g. week).
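
For a sense of what rounding to a weekly granularity involves, here is a minimal sketch using only the standard library. It assumes weeks start on Monday and that a value already on a boundary is left unchanged when rounding up; the library's actual conventions may differ, and `floor_to_week` and `ceil_to_week` are hypothetical names.

```python
from datetime import datetime, timedelta

def floor_to_week(ts):
    """Round down to midnight on the Monday of ts's week (Monday start assumed)."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=ts.weekday())

def ceil_to_week(ts):
    """Round up to the next week boundary; boundary values stay put (assumed)."""
    floored = floor_to_week(ts)
    return floored if floored == ts else floored + timedelta(weeks=1)

ts = datetime(2021, 1, 6, 15, 30)  # a Wednesday afternoon
week_start = floor_to_week(ts)     # Monday 2021-01-04 00:00
week_end = ceil_to_week(ts)        # Monday 2021-01-11 00:00
```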

Vector

between() Check whether values are in a specified range.
case_when(), if_else() Generalized if statements.
coalesce() Use first non-missing element across columns.
cumall(), cumany(), cummean() Cumulative all, any, and mean.
lag(), lead() Shift values later (lag) or earlier (lead) in time.
n() Calculate the number of observations in a vector.
n_distinct() Count the number of unique values.
na_if() Convert a value to NA.
near() Check whether each pair of values in two vectors is close.
nth(), first(), last() Return the first, last, or nth value.
row_number(), ntile(), min_rank(), dense_rank(), percent_rank(), cume_dist() Windowed rank functions.
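
The shifting and ranking functions above are easiest to compare on a small example with ties. The following plain-Python sketch follows the usual definitions of these window functions; it is not the library's code, the function bodies are written here for illustration, and the missing-value handling (padding with `None`) is an assumption.

```python
def lag(values, n=1, default=None):
    """Shift values later by n positions, padding the start with default."""
    return ([default] * n + list(values))[:len(values)]

def lead(values, n=1, default=None):
    """Shift values earlier by n positions, padding the end with default."""
    return (list(values) + [default] * n)[n:]

def min_rank(values):
    """Ties share the lowest rank; the next rank skips ahead (1, 2, 2, 4)."""
    return [1 + sum(v < x for v in values) for x in values]

def dense_rank(values):
    """Ties share a rank and no ranks are skipped (1, 2, 2, 3)."""
    order = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [order[x] for x in values]

def cume_dist(values):
    """Proportion of values less than or equal to each value."""
    return [sum(v <= x for v in values) / len(values) for x in values]

scores = [10, 20, 20, 30]  # hypothetical data with one tie
```

With these scores, `min_rank` gives `[1, 2, 2, 4]` while `dense_rank` gives `[1, 2, 2, 3]`, which is the main practical difference between the two.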