+4 votes
in Programming Languages by (5.2k points)
I want to find the unique elements in a large R vector. What function should I use?

1 Answer

0 votes
by (33.0k points)

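You can use either unique() or duplicated() to find the unique values in an R vector. unique() returns the distinct values directly, in order of first appearance; duplicated() returns a logical vector that you then use to subset the original vector.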

The duplicated() function returns FALSE for the first occurrence of each element in the vector and TRUE for every later occurrence, i.e., for the duplicates. To get the unique elements, keep only the positions where it returns FALSE.

Here is an example using both functions.

> a <- c(1,0,11,2,1,5,0,1,2,3,4,5,6,7,1,2,3,4,5,3,1)
> unique(a)
[1]  1  0 11  2  5  3  4  6  7
> duplicated(a)
 [1] FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
> a[!duplicated(a)]
[1]  1  0 11  2  5  3  4  6  7
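
If you would rather run this as a script than type it at the console, the same steps can be sketched as below. The variable names (first_seen, first_idx, via_unique, via_duplicated) are made up for illustration; which() is used only to show where each first occurrence sits in the original vector.

```r
# Same sample vector as in the console example
a <- c(1,0,11,2,1,5,0,1,2,3,4,5,6,7,1,2,3,4,5,3,1)

# Logical mask: TRUE at the first occurrence of each value
first_seen <- !duplicated(a)

# Positions of those first occurrences in the original vector
first_idx <- which(first_seen)

# The two ways of getting the unique values
via_unique <- unique(a)
via_duplicated <- a[first_seen]

# Check that both approaches give the same values in the same order
identical(via_unique, via_duplicated)
```

For simply getting the unique values, unique(a) is the shorter call; the duplicated() form is mainly useful when you also need the mask or the positions.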

...