diff --git a/inst/appdir/characterizationOfAlgorithmsBootstrapping.Rmd b/inst/appdir/characterizationOfAlgorithmsBootstrapping.Rmd index 64a6bc2..6f68a62 100644 --- a/inst/appdir/characterizationOfAlgorithmsBootstrapping.Rmd +++ b/inst/appdir/characterizationOfAlgorithmsBootstrapping.Rmd @@ -1,71 +1,74 @@ ### Ranking stability: Ranking variability via bootstrap approach A blob plot of bootstrap results over the different tasks separated by algorithm allows another perspective on the assessment data. This gives deeper insights into the characteristics of tasks and the ranking uncertainty of the algorithms in each task. <!-- 1000 bootstrap Rankings were performed for each task. --> <!-- Each algorithm is considered separately and for each subtask (x-axis) all observed ranks across bootstrap samples (y-axis) are displayed. Additionally, medians and IQR is shown in black. --> <!-- We see which algorithm is consistently among best, which is consistently among worst, which vary extremely... --> \bigskip ```{r blobplot_bootstrap_byAlgorithm,fig.width=7,fig.height = 5} #stabilityByAlgorithm.bootstrap.list if (n.tasks<=6 & n.algorithms<=10 ){ stabilityByAlgorithm(boot_object, ordering=ordering_consensus, max_size = 9, size=4, shape=4, single = F) + scale_color_manual(values=cols) + guides(color = FALSE) } else { pl=stabilityByAlgorithm(boot_object, ordering=ordering_consensus, max_size = 9, size=4, shape=4, single = T) for (i in 1:length(pl)) print(pl[[i]] + scale_color_manual(values=cols) + guides(size = guide_legend(title="%"),color="none") ) } ``` <!-- Stacked frequencies of observed ranks across bootstrap samples are displayed with colouring according to subtask. Vertical lines provide original (non-bootstrap) rankings for each subtask. --> \newpage An alternative representation is provided by a stacked frequency plot of the observed ranks, separated by algorithm. Observed ranks across bootstrap samples are displayed with coloring according to the task. For algorithms that achieve the same rank in different tasks for the full assessment data set, vertical lines are on top of each other. Vertical lines allow to compare the achieved rank of each algorithm over different tasks. \bigskip ```{r stackedFrequencies_bootstrap_byAlgorithm,fig.width=7,fig.height = 5} if (n.tasks<=6 & n.algorithms<=10 ){ stabilityByAlgorithm(boot_object, ordering=ordering_consensus, stacked = TRUE, single = F) } else { pl=stabilityByAlgorithm(boot_object, ordering=ordering_consensus, stacked = TRUE, - single = T) + single = T) %++% + theme(legend.position = ifelse(n.tasks>20, + yes = "bottom", + no = "right")) print(pl) } ```