Gregory J. Deckler on 21 Apr 2016 00:52:48
Provide the option of not removing duplicates automatically when creating R visualizations, or provide the ability to create R datasets using the same syntax as shown in the comments when creating an R visualization.
- Comments (14)
RE: "R" Don't remove duplicates
This seems to have been under review for two and a half years now. Any update?
RE: "R" Don't remove duplicates
Yes, it should not be there. The histogram I created is not a proper one.
RE: "R" Don't remove duplicates
Please allow us to keep duplicates. They are also part of the historical data, and we need them for our forecasting in Power BI.
RE: "R" Don't remove duplicates
I strongly agree - it should be possible to turn off duplicate removal as an option.
RE: "R" Don't remove duplicates
Please do the same for Python!
RE: "R" Don't remove duplicates
Microsoft always adds useless features that are always dumb. At the very least, it should be possible to disable this stupid feature.
RE: "R" Don't remove duplicates
The workaround for this is to use an index column. However, this does not work if the data is coming from different tables!!! I would have to create a new table containing the variables of interest, with an index column for that new table. This would defeat the purpose of using a relational data structure, among other things.
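To illustrate why the index column workaround keeps every row, here is a minimal sketch in plain Python. It assumes the de-duplication drops rows that are identical across all selected columns; `drop_duplicate_rows` is a hypothetical stand-in for that step, not Power BI's actual code, and the sample values are made up.

```python
def drop_duplicate_rows(rows):
    """Keep the first occurrence of each row, preserving order.

    Hypothetical stand-in for the automatic de-duplication step.
    """
    return list(dict.fromkeys(rows))


# Made-up single-column data with repeated values.
values = [5, 5, 7, 5, 7]

# Without an index column, identical rows collapse into one.
rows_without_index = [(v,) for v in values]
deduped_without_index = drop_duplicate_rows(rows_without_index)

# With an index column, every row is unique, so nothing is dropped.
rows_with_index = [(i, v) for i, v in enumerate(values)]
deduped_with_index = drop_duplicate_rows(rows_with_index)

print(len(deduped_without_index))  # 2
print(len(deduped_with_index))     # 5
```

The index only helps when it is unique per row of the data actually passed to the script, which is why columns drawn from different tables make the workaround awkward.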
RE: "R" Don't remove duplicates
Why does it remove duplicates by default? When performing univariate qualitative analysis, I want to be able to drop in a single qualitative field. This means I WANT duplicates, and having keys complicates the analysis needlessly.
That automatic removal should be made an option. I believe it was added because R scripts in Power BI are limited to 150k rows. The removal of duplicates by Power BI (in what I assume is some sort of pre-processor-directive-like call, judging by the invalid code syntax shown in the editor) probably helps mitigate that limitation for certain types of data sets. Unfortunately, without the option to turn off that "pre-processor"-like call, an entire segment of potential analysis is complicated or even impossible (if the original data set has no key).
RE: "R" Don't remove duplicates
I have no idea why this feature isn't standard behaviour - it's trivial in R to remove duplicates from a dataset if that behaviour is desired.
RE: "R" Don't remove duplicates
I want to use R to create a histogram. I add one column, and then it removes all the duplicates, which produces a completely inaccurate histogram. This is pretty silly - and potentially problematic if someone uses this without noticing.
Yes, I can add extra columns, or create an ID column, but I don't want to. I don't want the program to remove duplicates, just because it sees fit. There are times when it isn't appropriate - and as the analyst I want that choice.
Also, I want to write the simplest code, and that should involve only one column for a histogram.
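The effect on a single-column histogram can be sketched in a few lines of plain Python. This is only an illustration with made-up data: the de-duplication is simulated with `dict.fromkeys`, which is an assumption about the behaviour described in this thread and not Power BI's own code.

```python
from collections import Counter

# Made-up single-column data, as it exists in the source table.
observations = ["red", "blue", "red", "red", "green", "blue"]

# Frequencies a histogram or bar chart should show.
true_counts = Counter(observations)

# Simulated automatic de-duplication: only distinct values survive.
deduped = list(dict.fromkeys(observations))
deduped_counts = Counter(deduped)

print(true_counts)     # red: 3, blue: 2, green: 1
print(deduped_counts)  # every value appears exactly once
```

After de-duplication every category has a count of one, so the chart is flat regardless of the real distribution.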