Check which files are still relevant
Open, Normal, Public

Description

Some files still contain unused code, and I'm not sure whether it is needed in other scenarios that are not well documented:

  • select.R
  • benchmarkUtils.R
  • winner.R
  • S3.R
  • second.R
  • extract.workflow.R
  • compareRanks.R

Event Timeline

  • compareRanks() compares two ranking lists and computes Kendall's tau; I would leave it in the package.
  • benchmarkUtils links to the benchmark package (archived on CRAN), which has some more features but is no longer maintained. It might be dropped.
  • winner() extracts the winner (first ranked) for each task. It is simplistic but might be a handy convenience function:
    ranking <- challenge %>%
      aggregateThenRank(FUN = mean,          # aggregation function,
                        na.treat = 0,        # either "na.rm" to remove missing data,
                        ties.method = "min"  # a character string specifying
      )
    winner(ranking)
  • second() was similar but is not maintained; drop it.
  • S3 contains the print functions and should be kept (although it might not be properly maintained). It controls the output when you run:
    ranking <- challenge %>%
      aggregateThenRank(FUN = mean,          # aggregation function,
                        na.treat = 0,        # either "na.rm" to remove missing data,
                        ties.method = "min"  # a character string specifying
      )
    ranking
  • extract.workflow() is a convenience function that extracts the workflow from one object so that the same workflow can be applied to another. It was supposed to get some more functionality, but I would keep it. It becomes more interesting with something like:
    ranking <- challenge %>% rank() %>% aggregate(FUN = mean, na.treat = 0) %>% rank()

    workfl <- extract.workflow(ranking)
    another_challenge_object %>% workfl # do the same to other challenge
  • select.R: select.if() was supposed to allow subsetting of results, e.g.
    comp1 <- compareRanks(a1_mean, a1_median)
    # exclude all tasks with 1 or 2 algorithms
    # comp1[sapply(comp1, function(x) nrow(x$mat) > 2)]
    comp1 %>% select.if(function(x) nrow(x) > 2)
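
For reference, a minimal base R sketch of the kind of comparison compareRanks() is described as doing above: Kendall's tau between two rankings of the same algorithms. This is not the package's implementation, and the two rankings are made-up illustration data.

```r
# Two hypothetical rankings of the same five algorithms (1 = best).
# The names and values are invented for illustration only.
rank_mean   <- c(A1 = 1, A2 = 2, A3 = 3, A4 = 4, A5 = 5)
rank_median <- c(A4 = 5, A2 = 1, A1 = 2, A5 = 4, A3 = 3)

# Align the second ranking to the order of the first by algorithm name,
# so that each position refers to the same algorithm in both vectors.
rank_median <- rank_median[names(rank_mean)]

# Kendall's tau: (concordant pairs - discordant pairs) / total pairs.
# Here 8 of the 10 pairs are concordant and 2 are discordant.
tau <- cor(rank_mean, rank_median, method = "kendall")
print(tau)
```

A tau of 1 would mean both rankings agree on every pair of algorithms, and -1 that one ranking is the exact reverse of the other.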

In my opinion, the functionality that we want to keep should also have unit tests, both to (1) indicate that it is maintained and (2) demonstrate how to use it.
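
To illustrate what such a test could look like, here is a minimal sketch using testthat (assumed here as the test framework). first_ranked() is a hypothetical stand-in for a small convenience function like winner(); it is not the package's code.

```r
library(testthat)

# Hypothetical stand-in for a convenience function such as winner():
# returns the name(s) of the best-ranked algorithm (rank 1 = best).
first_ranked <- function(ranks) {
  names(ranks)[ranks == min(ranks)]
}

# Each test doubles as a usage example for the function.
test_that("first_ranked() returns the algorithm with the lowest rank", {
  ranks <- c(A1 = 2, A2 = 1, A3 = 3)
  expect_equal(first_ranked(ranks), "A2")
})

test_that("first_ranked() returns all algorithms tied for first place", {
  ranks <- c(A1 = 1, A2 = 1, A3 = 3)
  expect_equal(first_ranked(ranks), c("A1", "A2"))
})
```

Even for very simple functions, a test like this documents the expected input and output in a form that is checked on every build.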

What about keeping S3, compareRanks and extract.workflow, but not exporting compareRanks and extract.workflow (i.e. removing them from the namespace exports)? They could then only be accessed via e.g. challengeR:::compareRanks(). These might still be of practical use.
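
As a quick illustration of how access to non-exported functions works in R, the sketch below uses print.lm from the stats package, which is registered as an S3 method but not exported. It only demonstrates the `:::` mechanism and says nothing about how challengeR's namespace is actually set up.

```r
# print.lm is an S3 method that stats registers but does not export,
# so it is not in the list of exported names.
is_exported <- "print.lm" %in% getNamespaceExports("stats")
print(is_exported)

# The triple-colon operator still reaches the internal object.
internal_print_lm <- stats:::print.lm
print(is.function(internal_print_lm))

# getFromNamespace() is an equivalent programmatic way to fetch it.
same_object <- identical(internal_print_lm,
                         getFromNamespace("print.lm", "stats"))
print(same_object)
```

Functions accessed this way are not part of the public interface, so users should expect them to change or disappear without notice.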

I kept select.if(), winner(), extract.workflow() and compareRanks()
and removed everything that is no longer supported.
as.warehouse (benchmarkUtils) is not exported; I recommend leaving it in because it may come in handy in specific situations.

I would suggest keeping it like this. @eisenman, if you feel uncomfortable with this, we could insert a "not tested" message for these functions, although some, like extract.workflow(), are ridiculously simple.

Ok, we can keep them in this release.