wrong text in report on algorithm subsets
Open, High, Public

Description

algorithmAttribute <- attr(challenge_multiple, "algorithm")
totalNumberOfAlgorithms <- length(levels(attr(object$data, "missingData")[[1]][[algorithmAttribute]]))

cat("The top ",
    length(levels(challenge_multiple[[1]][[attr(challenge_multiple, "algorithm")]])),
    " out of ",
    totalNumberOfAlgorithms,
    " algorithms are considered.\n")
  • This sentence is also printed when all algorithms are shown, i.e. when no subset is used.
  • totalNumberOfAlgorithms does not count the number of algorithms but the number of replaced missing cases.
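
The second point can be illustrated with a minimal base R sketch. The objects `full_data` and `missing_data` are hypothetical stand-ins, and the shape of the missing-data table (one row per replaced missing case) is an assumption, not the actual challengeR structure. When nothing is missing, counting factor levels in the missing-data table yields 0, while counting them in the full data yields the real number of algorithms.

```r
# Hypothetical stand-in for the full challenge data: 5 algorithms, 2 cases each.
full_data <- data.frame(
  alg_name = factor(rep(c("A1", "A2", "A3", "A4", "A5"), each = 2)),
  case     = rep(1:2, times = 5),
  value    = c(0.95, 0.91, 0.85, 0.81, 0.75, 0.71, 0.65, 0.61, 0.55, 0.51)
)

# Assumed shape of a missing-data table: one row per replaced missing case.
# It is empty here because the example data contain no missing values.
missing_data <- data.frame(
  alg_name = factor(character(0)),
  case     = integer(0)
)

# Counting levels in the missing-data table does not count algorithms.
n_from_missing <- length(levels(missing_data$alg_name))

# Counting levels in the full data does.
n_from_full <- length(levels(full_data$alg_name))

cat("from missing-data table:", n_from_missing, "\n")
cat("from full data:", n_from_full, "\n")
```

This would be consistent with the "0 out of 0" output reported in the comments for data without missing values.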

Event Timeline

wiesenfa triaged this task as High priority. Thu, Sep 24, 11:32 AM
wiesenfa created this task.
  1. This is on purpose, as I wanted to avoid further nested ifs, but it can be discussed.
  2. This was the only variable where I found all algorithm factors. How can they be accessed now? fulldata is not working anymore after your latest changes.
  1. I think it will be confusing if it writes "top 5 of 5 algorithms"; I would put an if.
  2. Which changes? as.challenge does not have a full data attribute. The changes in subset were made by you, see my comments in T27685.

Currently it says "The top 0 out of 0 algorithms are considered." in my latest example.

Could you please provide the steps to reproduce this?

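library(challengeR)
n=50 # assumption: n (number of cases per algorithm) is not defined in the snippet as posted
set.seed(4)
strip=runif(n,.9,1)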
c_ideal=cbind(task="c_ideal",
              rbind(
                      data.frame(alg_name="A1",value=runif(n,.9,1),case=1:n),
                      data.frame(alg_name="A2",value=runif(n,.8,.89),case=1:n),
                      data.frame(alg_name="A3",value=runif(n,.7,.79),case=1:n),
                      data.frame(alg_name="A4",value=runif(n,.6,.69),case=1:n),
                      data.frame(alg_name="A5",value=runif(n,.5,.59),case=1:n)
              ))
challenge=as.challenge(c_ideal, 
                       algorithm="alg_name", case="case", value="value",
                       smallBetter = T)
ranking=challenge%>%aggregateThenRank(FUN = "mean", # aggregation function, 
                                      na.treat="na.rm", # either "na.rm" to remove missing data, 
                                      ties.method = "min" # a character string specifying 
)

report(ranking,
       format = "PDF", # format can be "PDF", "HTML" or "Word"
       colors=viridis::inferno,
       latex_engine="pdflatex", # LaTeX engine for producing PDF output. Options are "pdflatex", "lualatex", and "xelatex"
       clean=TRUE, # optional. Using TRUE will clean intermediate files that are created during rendering.
       open=T
)

Ok, weird. For me it says: "The top 5 out of 5 algorithms are considered."

eisenman moved this task from Backlog to In Progress on the challengeR (v1.0) board.
  1. It will be several ifs at several locations if we want to have the text consistent with the multi-task reports.
  2. I had the changes on how the challenge object is set up for access in report generation in mind. I will investigate this.
  1. It won't be more than I needed for the missing data description, I guess ;-) I still think that this sentence should be avoided if no subset is actually used.
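
The conditional wording discussed in these comments could look like the following minimal sketch. The helper `subset_sentence` and its arguments are hypothetical and not part of challengeR; it only shows the idea of printing the sentence when a real subset is used.

```r
# Hypothetical helper: return the subset sentence only if fewer algorithms
# are shown than exist in total; otherwise return an empty string.
subset_sentence <- function(n_shown, n_total) {
  if (n_shown < n_total) {
    paste0("The top ", n_shown, " out of ", n_total,
           " algorithms are considered.\n")
  } else {
    ""
  }
}

cat(subset_sentence(3, 5))  # a subset is used, so the sentence is printed
cat(subset_sentence(5, 5))  # all algorithms are shown, so nothing is printed
```

Keeping the condition in one small function would avoid repeating the if at every location in the report templates, which was the concern about scattered ifs.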

Your result is strange, given that the wrong things are actually being counted.

I think I only got half of it ;)
The first part is correct, right? So it should say "The top 5" and not "The top 0" also in your case.

eisenman moved this task from In Progress to Done on the challengeR (v1.0) board. Thu, Sep 24, 5:15 PM

It's important to mention that the subset of algorithms should be drawn from the final ranking to avoid wrong results. So if bootstrapping is performed, create the subset from the bootstrapped ranking, not from the initial ranking that is passed to the bootstrapping step.
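
The order of operations described above might be sketched as follows. This is an assumption-laden sketch, not a verified recipe: it assumes challengeR is installed, that `bootstrap(..., nboot = )` and `subset(..., top = )` work as described in the package documentation, and that 50 cases, 100 bootstrap samples, and `smallBetter = FALSE` are acceptable placeholder values.

```r
# Sketch only: runs if challengeR is available; the calls to bootstrap() and
# subset(top = ) are assumed to match the package documentation.
if (requireNamespace("challengeR", quietly = TRUE)) {
  library(challengeR)

  set.seed(4)
  n <- 50  # assumed number of cases per algorithm
  data <- rbind(
    data.frame(alg_name = "A1", value = runif(n, .9, 1),   case = 1:n),
    data.frame(alg_name = "A2", value = runif(n, .8, .89), case = 1:n),
    data.frame(alg_name = "A3", value = runif(n, .7, .79), case = 1:n),
    data.frame(alg_name = "A4", value = runif(n, .6, .69), case = 1:n),
    data.frame(alg_name = "A5", value = runif(n, .5, .59), case = 1:n)
  )

  challenge <- as.challenge(data,
                            algorithm = "alg_name", case = "case", value = "value",
                            smallBetter = FALSE)
  ranking <- aggregateThenRank(challenge, FUN = mean,
                               na.treat = "na.rm", ties.method = "min")

  # Bootstrap the full ranking first ...
  ranking_bootstrapped <- bootstrap(ranking, nboot = 100)

  # ... and only then draw the subset from the bootstrapped ranking,
  # not from the initial `ranking` object.
  top3 <- subset(ranking_bootstrapped, top = 3)
}
```

Subsetting `ranking` first and bootstrapping afterwards would be the order this comment warns against.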