Power BI

Under Review

"R" Don't remove duplicates

Votes: 78

Gregory J. Deckler on 21 Apr 2016 00:52:48

Provide an option to not remove duplicates automatically when creating R visualizations, or provide the ability to create R datasets using the same syntax shown in the comments of an R visualization.

Comments (14)

d6e133db d7a0-4dac-a021-1c9ce8cd2820 on 05 Jul 2020 22:46:21

RE: "R" Don't remove duplicates

When connecting to an SQL Server Analysis Services cube, I don't have direct access to the keys that make records unique, so I can't include them to keep duplicates from being removed. As a result, it may not be possible to get accurate data into R from a cube in some cases. Many statistics and visualizations are worthless once duplicates have been removed. Can someone explain why removing duplicates was ever a good idea?

d6e133db d7a0-4dac-a021-1c9ce8cd2820 on 05 Jul 2020 22:45:03

RE: "R" Don't remove duplicates

I shouldn't have to add an ID or key field to get all the data. If I want to remove duplicates in R, it's trivial with the "unique" function.
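
To illustrate the point, here is a minimal sketch of how little it takes to deduplicate on the R side when you actually want to, and how dropping legitimately repeated rows shifts a summary statistic. The data frame `sales` and its columns are made up for the example; they are not from the original post.

```r
# Hypothetical data: two rows are legitimately identical (same region, same amount).
sales <- data.frame(
  region = c("East", "East", "West", "West"),
  amount = c(100, 100, 100, 300)
)

# Deduplicating in R is a single call when it is actually wanted.
deduped <- unique(sales)

nrow(sales)           # 4 rows in the full data
nrow(deduped)         # 3 rows once the repeated East/100 row is dropped
mean(sales$amount)    # 150
mean(deduped$amount)  # about 166.67: the statistic shifts after deduplication
```

Since the caller can opt in to `unique()` this cheaply, it seems reasonable to ask that the deduplication be optional rather than forced.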

ec1ee083 149b-4e26-8661-a5a60d3b6dc8 on 05 Jul 2020 22:40:04

RE: "R" Don't remove duplicates

The workaround here is to add an ID column to the data (and not use it in the R script).
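
A rough sketch of that workaround, run as plain R so it can be tried outside Power BI. It assumes the visual receives a data frame named `dataset`; the column names `RowID` and `Value` are hypothetical, and the `unique(dataset)` call only stands in for the automatic duplicate removal this thread describes.

```r
# Stand-in for the data frame Power BI would pass to the R visual.
# In a real visual `dataset` is created for you; RowID and Value are hypothetical names.
dataset <- data.frame(
  RowID = 1:4,
  Value = c(10, 10, 10, 20)
)

# Simulate the automatic duplicate removal. Because RowID differs on every row,
# no two rows are identical and all four survive.
dataset <- unique(dataset)

# Drop the ID column before analysis so it does not influence the results.
values <- dataset[, setdiff(names(dataset), "RowID"), drop = FALSE]

nrow(values)        # 4: the repeated 10s are still present
mean(values$Value)  # 12.5
```

The catch, as other comments here point out, is that this only works when a unique key is available to add in the first place.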

a9528410 5438-48b6-a4bd-227f31374d70 on 05 Jul 2020 22:19:58

RE: "R" Don't remove duplicates

The comments of an R visualization show:
#dataset <- data.frame(Column)

However, I cannot use the same syntax to create my own data frame.