Represent single-task challenge as multi-task challenge with one task
Closed, Resolved | Public


Single- and multi-task challenges are currently handled separately, in different data structures. This leads to duplicated production code (including separate report templates) and doubles the testing effort. To reduce the maintenance effort, the idea is to represent a single-task challenge as a multi-task challenge with one task. This can be done in the challenge class: when the argument "by", which specifies a multi-task challenge, is missing, the challenge constructor can require an argument for the task name instead. The single-task challenge will then also show up appropriately in the multi-task challenge report.
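A minimal sketch of this idea, written in Python for brevity rather than as the package's actual code. The function name as_challenge, the parameter task_name and the dict-of-rows layout are assumptions made for illustration only; just the argument name "by" comes from the description above.

```python
from typing import Dict, List, Optional


def as_challenge(rows: List[dict],
                 by: Optional[str] = None,
                 task_name: Optional[str] = None) -> Dict[str, List[dict]]:
    """Return a mapping from task name to the rows of that task.

    Hypothetical helper: a single-task data set (no `by` column) is stored
    in the same structure as a multi-task data set, with exactly one task.
    """
    if by is None:
        # Single-task case: a task name is required so that the data
        # can be wrapped as a multi-task challenge with one task.
        if task_name is None:
            raise ValueError("task_name is required when 'by' is missing")
        return {task_name: list(rows)}

    # Multi-task case: group the rows by the value of the `by` column.
    tasks: Dict[str, List[dict]] = {}
    for row in rows:
        tasks.setdefault(row[by], []).append(row)
    return tasks


single = as_challenge([{"algo": "A1", "value": 0.9}], task_name="task1")
multi = as_challenge(
    [{"algo": "A1", "value": 0.9, "task": "t1"},
     {"algo": "A1", "value": 0.7, "task": "t2"}],
    by="task",
)
```

Both results have the same shape, so downstream code only has to handle one data structure.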

Do you also think this makes sense? Am I missing something?

Event Timeline

eisenman created this task.

This might indeed reduce complexity; however:

  • many functions need to be adapted, which requires some care
  • behavior is sometimes different on purpose, e.g. plots in multi-task challenges have titles with the task name, while plots in single-task challenges have none
  • many visualizations apply only to multi-task challenges, and trying to use them in single-task challenges throws an error; this would need to be handled
  • also, reports for multi-task challenges contain more visualizations, which would be uninformative for single-task challenges (this could, however, be handled by checking the number of tasks internally)
  • a workaround would be necessary that adds a task column to single-task challenges (with the same label in every row)
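The last three points can be sketched roughly as follows. This is a language-neutral illustration in Python, not the package's code; the helper names add_task_column, number_of_tasks and require_multi_task, and the column name "task", are hypothetical.

```python
from typing import List


def add_task_column(rows: List[dict], label: str, column: str = "task") -> List[dict]:
    """Workaround: give every row of a single-task data set the same task label."""
    return [{**row, column: label} for row in rows]


def number_of_tasks(rows: List[dict], column: str = "task") -> int:
    """Count the distinct task labels in the data set."""
    return len({row[column] for row in rows})


def require_multi_task(rows: List[dict], plot_name: str, column: str = "task") -> None:
    """Raise a clear error when a multi-task-only visualization gets one task."""
    n = number_of_tasks(rows, column)
    if n < 2:
        raise ValueError(f"{plot_name} needs at least two tasks, got {n}")


single = add_task_column(
    [{"algo": "A1", "value": 0.9}, {"algo": "A2", "value": 0.8}],
    label="task1",
)

# A report could use the same count to skip uninformative sections
# instead of failing.
include_cross_task_plots = number_of_tasks(single) > 1
```

Checking the task count in one place gives both behaviors: reports can silently skip sections, while a plot function called directly can fail with an explicit message.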

Thank you for your feedback!

Regarding your 2nd point: if a single-task challenge containing a task column is loaded as a multi-task challenge, the plots will also have a title. Do you think this should be avoided, or made configurable? If the figures are exported as separate files, it is better to have the titles in the plots so that the end user does not mix up the plots, right?

Regarding your 5th point: this could be realized with the additional argument mentioned in the description.

I think this should be thought through carefully before it is put into action.

Now I know what you meant with all your points ;)
I started to work on this during our hacking days on this branch. I managed to transform a single-task data set into a multi-task data set with one task, and I had to adapt some plot code. For the reports, many duplicate changes would have been necessary, so along the way I decided to refactor the report generation to lower the future maintenance effort: I extracted the report sections into separate files and include them depending on the variables isMultiTask and bootstrappingEnabled. Each text block and code snippet now exists only once, and it is still possible to get four types of reports (single-task data set with and without bootstrapping, multi-task data set with and without bootstrapping). I compared the reports with reference reports that I generated from the master branch. They look the same, except that the plots for single-task data sets are now also labeled with the task name. The next step would be a general clean-up of the functionality that was used for single-task data sets.
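The selection logic described here could look roughly like the sketch below. It is an illustration in Python, not the actual report code: only the two flags isMultiTask and bootstrappingEnabled are taken from the comment above, and all section names are made up.

```python
from typing import List


def report_sections(is_multi_task: bool, bootstrapping_enabled: bool) -> List[str]:
    """Pick the report sections to include; each section exists only once.

    Section names are hypothetical placeholders for the extracted files.
    """
    sections = ["overview", "ranking"]  # shared by all four report types
    if bootstrapping_enabled:
        sections.append("ranking_stability")  # needs bootstrap results
    if is_multi_task:
        sections.append("cross_task_insights")  # only meaningful for > 1 task
    return sections


# The four report types mentioned above, built from the same building blocks.
report_types = {
    (multi, boot): report_sections(multi, boot)
    for multi in (False, True)
    for boot in (False, True)
}
```

Because every section is defined once and only switched on or off, a change to a text block or code snippet does not have to be repeated per report type.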

It would be great if you could have a look at the changes and see whether all this goes in the right direction! As long as it is not merged into the develop branch, it is still a suggestion!

In a single-task situation, a task will in practice not have a name, so I think there should be no title and no need to set a task name.
I'll have a look at it, but please give me some time.

Another important point, I think, is that all plot functions work as intended outside of the report (it is desirable that users can also create their own reports). This includes choosing the correct function and raising an error if a function does not work with single tasks, e.g.

From my side, the migration is now complete. As discussed in the meeting on Monday, the name of a task in a single-task data set is optional. I created test cases for the plot functions that return plot objects; separate issues have been created for those that don't. The extraction of top-performing algorithms and of subsets of tasks from rankings has also been migrated and tested. I calculated the package test coverage to identify code that is never executed and deleted what seemed obsolete to me. There are still files with unused code where I'm not sure whether it is needed in other scenarios that are not well documented:

  • select.R
  • benchmarkUtils.R
  • winner.R
  • S3.R
  • second.R
  • extract.workflow.R
  • compareRanks.R

It would be great if someone can have a look at the changes!

Thanks @aguilera for the extensive testing of the report generation. The residual issues from this task will be handled in T27494.