Code Snippets 🔖

R: Remove duplicate rows in a data frame – Snippet #2

Discover how to remove duplicate rows in a data frame with R

Olivier Simard-Casanova

Apr 15, 2022 — 1 min read

Packages

This snippet requires dplyr.

To load dplyr along with the rest of the Tidyverse:

library(tidyverse)

To load dplyr on its own:

library(dplyr)

Code

To remove duplicate rows in a data frame, use distinct() from dplyr. Duplicate rows are rows that have identical values in every column.

With the pipe operator:

new_df <- df %>%
  distinct()

Without the pipe operator:

new_df <- distinct(df)

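The code above keeps the first occurrence of each row and drops every later row that is identical to it in all columns. df itself is not modified; the deduplicated result is stored in new_df.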
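As a minimal sketch of what this looks like in practice, the example below builds a small, made-up data frame and applies distinct() to it. The column names id and color and their values are assumptions chosen for illustration; they do not come from a real dataset.

```r
library(dplyr)

# Hypothetical example data: the third row repeats the first row exactly.
df <- data.frame(
  id    = c(1, 2, 1, 3),
  color = c("red", "blue", "red", "red")
)

# Keep one copy of each fully identical row.
new_df <- distinct(df)

nrow(df)      # 4: the original data frame is unchanged
nrow(new_df)  # 3: the repeated (1, "red") row was dropped
```

Note that the last row (3, "red") is kept: it shares a color with other rows, but it is not identical to any of them in every column.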

Resources

Keep distinct/unique rows: distinct() (dplyr documentation)

Keep only unique/distinct rows from a data frame. This is similar to unique.data.frame() but considerably faster.
