Report contains misleading information on missing values
Closed, Resolved (Public)


When a data set contains NAs, the user is forced to specify a treatment option when constructing the challenge object. The generated report then states that there are 0 missing values (probably because they have already been replaced or removed). This is misleading. For clarity, the report should state the initial number of NAs and the treatment strategy.

Also, the treatment option should not be mandatory when constructing the challenge object if the rank-then-aggregate ranking method is used.

Event Timeline

What do the others think? Which number of NAs should be stated in the report?

I think it is interesting to state the number of NAs in the report, but this should be the actual number of NAs (not 0).

I agree, it would be nice if the actual number of NAs were reported (together with the na.treat method), and not the number after na.treat, which is then obviously 0.
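
To illustrate the suggestion, here is a minimal base-R sketch of counting the NAs before the treatment is applied, so that both the initial count and the strategy can be reported. It is not challengeR's implementation: the helper `treat_missings`, its arguments, the column name `value`, and the message wording are all hypothetical, and the treatment option is modelled only as either a single number or the string "na.rm".

```r
# Hypothetical helper, not part of challengeR: record the NA count
# before applying a treatment, so a report can state both.
treat_missings <- function(data, value_col = "value", na_treat = 0) {
  is_missing <- is.na(data[[value_col]])
  n_missing <- sum(is_missing)  # count BEFORE any replacement or removal

  if (identical(na_treat, "na.rm")) {
    # Assumed strategy 1: drop rows with missing values
    treated <- data[!is_missing, , drop = FALSE]
    strategy <- "removed"
  } else if (is.numeric(na_treat) && length(na_treat) == 1) {
    # Assumed strategy 2: replace missing values by a fixed number
    treated <- data
    treated[[value_col]][is_missing] <- na_treat
    strategy <- paste("replaced by", na_treat)
  } else {
    stop("na_treat must be \"na.rm\" or a single number")
  }

  list(
    data = treated,
    n_missing_initial = n_missing,
    message = sprintf("%d missing value(s) found in the data set; %s.",
                      n_missing, strategy)
  )
}

# Toy data with one NA (made up for illustration)
toy <- data.frame(
  algorithm = c("A1", "A1", "A2", "A2"),
  case = c("c1", "c2", "c1", "c2"),
  value = c(0.8, NA, 0.6, 0.7)
)

result <- treat_missings(toy, na_treat = 0)
print(result$message)
```

The point of the sketch is only the ordering: the count is taken from the untreated data and carried along with the treated data, so the report does not have to recompute it after the NAs are gone.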

as.challenge() as well as the report now contain extended information on missings and automatically inserted test cases.
@eisenman this changes the tests for as.challenge. Could you please 1. check whether the messages are well phrased, 2. adapt the tests, and 3. close the task? Thanks!

I checked the report and I think the handling of missings is not clear: it first says that no observations are missing, and then says that algorithm performances are missing. The value of the replacement should not appear in the table.

challengeR-report-missings.PNG (256×1 px, 28 KB)

The first sentence refers to the fact that no missings have been found in the data set. Would rephrasing it to "0 missing cases entered in the data set have been found" help? The next sentence is about cases which have been inserted by the sanity check. Would adding "However, ..." in between help?
If you have any other suggestions, I would be happy to include them.
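
The distinction between missings entered in the data set and cases inserted by the sanity check can be sketched in base R as follows. This is an assumption about how such a check could work, not the actual code behind as.challenge(): the helper `complete_case_grid`, the column names, and the message wording are hypothetical.

```r
# Hypothetical helper, not part of challengeR: separate NAs that were
# entered in the data from algorithm-case combinations that are absent.
complete_case_grid <- function(data, algorithm_col = "algorithm",
                               case_col = "case", value_col = "value") {
  # 1. Missings the user entered explicitly
  n_missing_entered <- sum(is.na(data[[value_col]]))

  # 2. All algorithm-case combinations that should exist
  full_grid <- expand.grid(
    unique(as.character(data[[algorithm_col]])),
    unique(as.character(data[[case_col]])),
    stringsAsFactors = FALSE
  )
  names(full_grid) <- c(algorithm_col, case_col)

  # 3. Combinations that never occur in the data (tab-separated keys)
  observed_keys <- paste(data[[algorithm_col]], data[[case_col]], sep = "\t")
  grid_keys <- paste(full_grid[[algorithm_col]], full_grid[[case_col]], sep = "\t")
  inserted <- full_grid[!(grid_keys %in% observed_keys), , drop = FALSE]
  inserted[[value_col]] <- rep(NA_real_, nrow(inserted))
  rownames(inserted) <- NULL

  # 4. Append the absent combinations as explicit NA rows
  completed <- rbind(data[, c(algorithm_col, case_col, value_col)], inserted)
  rownames(completed) <- NULL

  list(
    completed = completed,
    inserted = inserted,
    n_missing_entered = n_missing_entered,
    message = sprintf(
      "%d missing value(s) entered in the data set; %d algorithm-case combination(s) inserted as missing.",
      n_missing_entered, nrow(inserted)
    )
  )
}

# Toy data: no NA entered, but algorithm A2 has no result for case c2
toy <- data.frame(
  algorithm = c("A1", "A1", "A2"),
  case = c("c1", "c2", "c1"),
  value = c(0.8, 0.9, 0.6)
)

out <- complete_case_grid(toy)
print(out$message)
print(out$inserted[, c("algorithm", "case")])
```

Returning `inserted` without the replacement value in the printed table matches the request above that the report list the affected cases but not the value they are later replaced with.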

I like the proposal!
"0 missing cases have been found in the data set. However, performance of not all algorithms has been observed for all cases. Therefore, missings have been inserted in the following cases:"