wrong text in report on algorithm subsets
Closed, Resolved, Public

Assigned To

Authored By

	wiesenfa
	Sep 24 2020, 11:32 AM

Description

algorithmAttribute <- attr(challenge_multiple, "algorithm")
totalNumberOfAlgorithms <- length(levels(attr(object$data, "missingData")[[1]][[algorithmAttribute]]))

cat("The top ",
    length(levels(challenge_multiple[[1]][[attr(challenge_multiple, "algorithm")]])),
    " out of ",
    totalNumberOfAlgorithms,
    " algorithms are considered.\n")

This sentence also appears in the report when all algorithms are shown, i.e. when no subset is used.
totalNumberOfAlgorithms does not count the number of algorithms but the number of replaced missing cases.
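
To illustrate the counting problem outside of challengeR, here is a minimal base-R sketch. The objects fullData and missingData are toy stand-ins made up for this example and are not the package's internal structures. It only shows that factor levels taken from a record of missing cases need not match the number of algorithms in the data.

```r
# Toy stand-in (hypothetical): the full data with five algorithms, two cases each.
fullData <- data.frame(
  alg_name = factor(rep(c("A1", "A2", "A3", "A4", "A5"), each = 2)),
  value    = c(0.95, 0.97, 0.85, 0.88, 0.75, 0.78, 0.65, 0.68, 0.55, 0.58)
)

# Toy stand-in (hypothetical): a record of missing cases. Nothing is missing
# here, so it has zero rows and, after dropping unused levels, zero levels.
missingData <- fullData[0, ]
missingData$alg_name <- droplevels(missingData$alg_name)

# Levels in the missing-data record describe what was missing, not what competed.
nFromMissing <- nlevels(missingData$alg_name)

# Levels in the full data give the actual number of algorithms.
nFromFull <- nlevels(droplevels(fullData$alg_name))

cat("from missing-data record:", nFromMissing, "\n")
cat("from full data:", nFromFull, "\n")
```

In this toy setup the first count is 0 and the second is 5. Whether this is the exact mechanism behind the wrong text in the report is an assumption.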

Related Objects

Mentioned In: rCHALLENGER595046260c52: Merge branch 'feature/T27774-FixReportingOfAlgorithmSubset' into develop
Mentioned Here: T27685: changes in subset()

Event Timeline

wiesenfa triaged this task as High priority. Sep 24 2020, 11:32 AM

wiesenfa created this task.

This is on purpose, as I wanted to avoid further nested ifs, but it can be discussed.
This was the only variable where I found all algorithm factors. How can they be accessed now? fulldata is not working anymore after your latest changes.

I think it will be confusing if it writes "top 5 of 5 algorithms"; I would add an if.
Which changes? as.challenge does not have a full data attribute. The changes in subset were made by you, see my comments in T27685.

Currently it says "The top 0 out of 0 algorithms are considered." in my latest example.

In T27774#210733, @wiesenfa wrote:

Currently it says "The top 0 out of 0 algorithms are considered." in my latest example.

Could you please provide the steps to reproduce this?

library(challengeR)  # assumed to be loaded in the original session
n <- 50  # assumption: n is not defined in the original post
set.seed(4)
strip <- runif(n, .9, 1)  # unused below; kept because it advances the random number stream
c_ideal <- cbind(task = "c_ideal",
                 rbind(
                   data.frame(alg_name = "A1", value = runif(n, .9, 1), case = 1:n),
                   data.frame(alg_name = "A2", value = runif(n, .8, .89), case = 1:n),
                   data.frame(alg_name = "A3", value = runif(n, .7, .79), case = 1:n),
                   data.frame(alg_name = "A4", value = runif(n, .6, .69), case = 1:n),
                   data.frame(alg_name = "A5", value = runif(n, .5, .59), case = 1:n)
                 ))
challenge <- as.challenge(c_ideal,
                          algorithm = "alg_name", case = "case", value = "value",
                          smallBetter = TRUE)
ranking <- challenge %>% aggregateThenRank(FUN = "mean",         # aggregation function
                                           na.treat = "na.rm",   # "na.rm" removes missing data
                                           ties.method = "min")  # how ties are ranked

report(ranking,
       format = "PDF",             # format can be "PDF", "HTML" or "Word"
       colors = viridis::inferno,
       latex_engine = "pdflatex",  # LaTeX engine for producing PDF output. Options are "pdflatex", "lualatex", and "xelatex"
       clean = TRUE,               # optional. Using TRUE will clean intermediate files that are created during rendering.
       open = TRUE)

Ok, weird. For me it says: "The top 5 out of 5 algorithms are considered."

In T27774#210731, @wiesenfa wrote:

I think it will be confusing if it writes "top 5 of 5 algorithms"; I would add an if.

Which changes? as.challenge does not have a full data attribute. The changes in subset were made by you, see my comments in T27685.

It will take several ifs at several locations if we want the text to be consistent with the multi-task reports.
I was thinking of the changes to how the challenge object is set up for access during report generation. I will investigate this.

It won't be more than I needed for the missing data description, I guess ;-) I still think that this sentence should be avoided if no subset is in fact used.
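
The guard being asked for could look roughly like the sketch below. subsetSentence is a hypothetical helper written for this comment, not a function in challengeR, and where it would be called in the report templates is left open.

```r
# Hypothetical helper, not part of challengeR: returns the subset sentence
# only when fewer algorithms are shown than exist, otherwise an empty string.
subsetSentence <- function(nShown, nTotal) {
  if (nShown < nTotal) {
    paste0("The top ", nShown, " out of ", nTotal,
           " algorithms are considered.\n")
  } else {
    ""
  }
}

cat(subsetSentence(3, 5))  # a proper subset: the sentence is printed
cat(subsetSentence(5, 5))  # all algorithms shown: nothing is printed
```

Building the string with paste0() also avoids the doubled spaces that cat() produces in the snippet from the description, since cat() separates its arguments with a space by default.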

In T27774#210741, @eisenman wrote:

Ok, weird. For me it says: "The top 5 out of 5 algorithms are considered."

Your result is strange, given that the wrong things are actually being counted.

In T27774#210754, @wiesenfa wrote:

In T27774#210741, @eisenman wrote:

Ok, weird. For me it says: "The top 5 out of 5 algorithms are considered."

Your result is strange, given that the wrong things are actually being counted.

I think I only got half of it ;)
The first part is correct, right? So it should say "The top 5" and not "The top 0" also in your case.

eisenman mentioned this in rCHALLENGER595046260c52: Merge branch 'feature/T27774-FixReportingOfAlgorithmSubset' into develop. Sep 24 2020, 5:09 PM

It's important to mention that the subset of algorithms should be drawn from the final ranking to avoid wrong results. So if bootstrapping is performed, create the subset from the bootstrapped ranking, not from the initial ranking that is passed in to perform the bootstrapping.

eisenman closed this task as Resolved. Dec 20 2020, 9:28 PM
