distinct
distinct(__data, *args, _keep_all=False, **kwargs)
    Keep only distinct (unique) rows from a table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `__data` | | The input data. | required |
| `*args` | | Columns to use when determining which rows are unique. | `()` |
| `_keep_all` | | Whether to keep all columns of the original data, not just `*args`. | `False` |
| `**kwargs` | | If specified, arguments passed to the verb `mutate()`; the resulting columns are then used in `distinct()`. | `{}` |
Examples:
>>> from siuba import _, distinct, select
>>> from siuba.data import penguins
>>> penguins >> distinct(_.species, _.island)
     species     island
0     Adelie  Torgersen
1     Adelie     Biscoe
2     Adelie      Dream
3     Gentoo     Biscoe
4  Chinstrap      Dream
Use _keep_all=True to keep all columns in each distinct row. This lets you peek at the values of the first row found for each unique value.
>>> small_penguins = penguins >> select(_[:4])
>>> small_penguins >> distinct(_.species, _keep_all = True)
     species     island  bill_length_mm  bill_depth_mm
0     Adelie  Torgersen            39.1           18.7
1     Gentoo     Biscoe            46.1           13.2
2  Chinstrap      Dream            46.5           17.9
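The examples above do not exercise `**kwargs`. As a rough sketch in plain pandas (not siuba's own code), a keyword argument behaves like computing a new column first and then de-duplicating on it. The toy frame and the column name `long_bill` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy data, not the real penguins dataset.
df = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo"],
    "bill_length_mm": [39.1, 41.0, 46.1, 47.2],
})

# Approximately what distinct(_.species, long_bill=_.bill_length_mm > 40)
# is expected to do: add the computed column (the mutate step), then keep
# one row per unique combination of the key columns.
with_flag = df.assign(long_bill=df["bill_length_mm"] > 40)
keys = ["species", "long_bill"]
result = with_flag.drop_duplicates(keys).reset_index(drop=True)[keys]
print(result)
```

Here the two Gentoo rows collapse into one because they share the same `long_bill` value, while the two Adelie rows stay separate.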
Source code in siuba/dply/verbs.py
@singledispatch2(DataFrame)
def distinct(__data, *args, _keep_all = False, **kwargs):
    """Keep only distinct (unique) rows from a table.
    Parameters
    ----------
    __data:
        The input data.
    *args:
        Columns to use when determining which rows are unique.
    _keep_all:
        Whether to keep all columns of the original data, not just *args.
    **kwargs:
        If specified, arguments passed to the verb mutate(), and then being used
        in distinct().
    See Also
    --------
    count : keep distinct rows, and count their number of observations.
    Examples
    --------
    >>> from siuba import _, distinct, select
    >>> from siuba.data import penguins
    >>> penguins >> distinct(_.species, _.island)
         species     island
    0     Adelie  Torgersen
    1     Adelie     Biscoe
    2     Adelie      Dream
    3     Gentoo     Biscoe
    4  Chinstrap      Dream
    Use _keep_all=True to keep all columns in each distinct row. This lets you
    peek at the values of the first row found for each unique value.
    >>> small_penguins = penguins >> select(_[:4])
    >>> small_penguins >> distinct(_.species, _keep_all = True)
         species     island  bill_length_mm  bill_depth_mm
    0     Adelie  Torgersen            39.1           18.7
    1     Gentoo     Biscoe            46.1           13.2
    2  Chinstrap      Dream            46.5           17.9
    """
    if not (args or kwargs):
        return __data.drop_duplicates().reset_index(drop=True)
    new_names, df_res = _mutate_cols(__data, args, kwargs)
    tmp_data = df_res.drop_duplicates(new_names).reset_index(drop=True)
    if not _keep_all:
        return tmp_data[new_names]
    return tmp_data
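The three branches of the function body can be mirrored in plain pandas. The helper below is a minimal sketch, not part of siuba: `distinct_sketch` is a hypothetical name, and it accepts only column-name strings, so it skips the `_mutate_cols` step that handles siu expressions and keyword arguments.

```python
import pandas as pd

def distinct_sketch(df, cols=(), keep_all=False):
    """Hypothetical stand-in for the column-name-only case of distinct()."""
    cols = list(cols)
    if not cols:
        # No columns given: de-duplicate on whole rows.
        return df.drop_duplicates().reset_index(drop=True)
    # Keep the first row seen for each unique combination of `cols`.
    out = df.drop_duplicates(cols).reset_index(drop=True)
    # Either return every column, or only the key columns.
    return out if keep_all else out[cols]

df = pd.DataFrame({"g": ["a", "a", "b"], "x": [1, 2, 3]})
only_keys = distinct_sketch(df, ["g"])
first_rows = distinct_sketch(df, ["g"], keep_all=True)
all_rows = distinct_sketch(df)
```

Because `drop_duplicates` keeps the first occurrence by default, `first_rows` carries the `x` value from the first row of each group, which matches the `_keep_all=True` behavior shown in the examples.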