[external] Seed does not have an effect on bootstrap samples
Closed, ResolvedPublic

Assigned To
Authored By
eisenman
Oct 13 2022, 2:44 PM
Referenced Files
F2634971: BS_win1.png
Mar 10 2023, 2:57 PM
F2634972: BS_win2.png
Mar 10 2023, 2:57 PM
F2577817: test_bootstrap_replication.R
Oct 13 2022, 2:44 PM

Description

The issue has been reported on GitHub:
https://github.com/wiesenfa/challengeR/issues/36

These are @wiesenfa's findings:

Parallelized computation currently yields different results from run to run, whereas sequential computation does not.

The solution is relatively simple, though. In the README (and probably the vignettes), replacing

set.seed(1)

with

set.seed(1, kind = "L'Ecuyer-CMRG")

switches the random number generator, and the results are then reproducible in parallelized runs as well. I'd suggest setting the generator explicitly for sequential computation too, using either kind = "L'Ecuyer-CMRG" or kind = "default", because otherwise the most recently selected generator stays in effect until R is restarted. As far as I understand, both the default generator and L'Ecuyer have their pros and cons, but I'm not sure which one is preferable for sequential computation.
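The point about the generator staying in effect can be checked in a plain R session. The following is a minimal sketch using only base R; the variable names are illustrative and not taken from challengeR.

```r
# Remember the current RNG settings so they can be restored at the end.
old_kind <- RNGkind()

# Select L'Ecuyer-CMRG together with the seed.
set.seed(1, kind = "L'Ecuyer-CMRG")

# A later set.seed() without `kind` does not switch the generator back.
set.seed(1)
kind_after_plain_seed <- RNGkind()[1]

# Passing kind = "default" explicitly returns to R's default generator.
set.seed(1, kind = "default")
kind_after_default <- RNGkind()[1]

print(kind_after_plain_seed)
print(kind_after_default)

# Restore the settings that were active before this sketch ran.
RNGkind(kind = old_kind[1], normal.kind = old_kind[2], sample.kind = old_kind[3])
```

In current R versions the default generator is Mersenne-Twister, so the second value printed differs from the first.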

See rough test routine attached:

It should be noted that, even then, parallelized computation will not yield the same results as sequential computation, and parallel runs only reproduce each other if the same number of cores is used. Making the results independent of the number of cores would require first generating a stream of random seeds, one of which would be applied before each bootstrap sample is drawn. This would require internal changes to the code.
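The seed-stream idea described above could look roughly like the sketch below. It is not how challengeR is implemented; make_replicate_seeds() and draw_bootstrap_indices() are hypothetical helpers, and only the base parallel package is assumed. Each replicate gets its own L'Ecuyer-CMRG stream, so the draw for replicate b no longer depends on which worker handles it.

```r
library(parallel)

# Hypothetical helper: one L'Ecuyer-CMRG stream seed per bootstrap replicate.
# Note: this switches the session's generator to L'Ecuyer-CMRG.
make_replicate_seeds <- function(nboot, iseed) {
  RNGkind("L'Ecuyer-CMRG")
  set.seed(iseed)
  seeds <- vector("list", nboot)
  s <- .Random.seed
  for (b in seq_len(nboot)) {
    seeds[[b]] <- s
    s <- nextRNGStream(s)  # advance to the next independent stream
  }
  seeds
}

# Hypothetical helper: install the replicate's seed, then draw the sample.
draw_bootstrap_indices <- function(seed, n) {
  assign(".Random.seed", seed, envir = globalenv())
  sample.int(n, size = n, replace = TRUE)
}

seeds <- make_replicate_seeds(nboot = 10, iseed = 1)

# Sequential run.
sequential <- lapply(seeds, draw_bootstrap_indices, n = 20)

# Parallel run on a socket cluster with two workers.
cl <- makePSOCKcluster(2)
parallel_run <- parLapply(cl, seeds, draw_bootstrap_indices, n = 20)
stopCluster(cl)

print(identical(sequential, parallel_run))
```

Because the seed travels with the replicate rather than with the worker, the same comparison should hold for any number of workers.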

I’d suggest to include a warning in bootstrap.ranked.list() if the default random generator is used in conjunction with parallelization:

if (RNGkind()[1] != "L'Ecuyer-CMRG" && parallel) warning("To ensure reproducibility please use kind = \"L'Ecuyer-CMRG\" in set.seed(), e.g. set.seed(1, kind = \"L'Ecuyer-CMRG\")")
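To try the suggested check outside the package, it can be wrapped in a small function. The name check_parallel_rng() is hypothetical and not part of challengeR. The sketch only illustrates when the warning fires.

```r
# Hypothetical stand-alone version of the suggested check.
check_parallel_rng <- function(parallel) {
  if (parallel && RNGkind()[1] != "L'Ecuyer-CMRG") {
    warning("To ensure reproducibility please use kind = \"L'Ecuyer-CMRG\" ",
            "in set.seed(), e.g. set.seed(1, kind = \"L'Ecuyer-CMRG\")",
            call. = FALSE)
    return(invisible(FALSE))
  }
  invisible(TRUE)
}

# With Mersenne-Twister and parallel = TRUE, the warning is raised.
set.seed(1, kind = "Mersenne-Twister")
caught <- tryCatch(
  {
    check_parallel_rng(parallel = TRUE)
    "no warning"
  },
  warning = function(w) "warning raised"
)

# With L'Ecuyer-CMRG, the check passes silently.
set.seed(1, kind = "L'Ecuyer-CMRG")
ok <- check_parallel_rng(parallel = TRUE)

print(caught)
print(ok)
```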

Event Timeline

eisenman created this task.
eisenman moved this task from Backlog to In Progress on the challengeR board.

I implemented Manuel's suggestions in branch hotfix/T29361-EnsureReproducibilityWithParallelBootstrapping and added corresponding unit tests to test-bootstrap.R.

All the tests pass on my Linux system, but not on Windows. The test "two parallel bootstrappings yield same results" is failing. Could some of you please test on your systems as well?

I tested it with R 4.2.0 on a Windows system and got the same error. I stopped the test and looked at rankingBootstrapped1 and rankingBootstrapped2. Here are the screenshots:

BS_win2.png (779×1 px, 69 KB)

BS_win1.png (779×1 px, 62 KB)

The problem is that the bootsrappedRanks and Aggregate lists differ significantly between the two runs for the random and worstcase data. I think the cause lies somewhere deep in how seeding is implemented. Maybe a dependent library in the operating system is the source.

Oh, I hate it so much. I know the problem: only Windows is affected. Parallelization via forking does not work there; I keep forgetting this. I'll look for a solution on Windows.
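For context, a package-free sketch of seeding a socket (PSOCK) cluster, which works on Windows where forking is unavailable. It uses only base parallel functions. Whether this alone is enough for bootstrap() is not established here; it only shows that resetting the workers' streams with clusterSetRNGStream() makes two identical parSapply() calls agree.

```r
library(parallel)

# Socket cluster: workers are separate R processes, so no forking is needed.
cl <- makePSOCKcluster(2)

# Give each worker its own L'Ecuyer-CMRG stream derived from iseed.
clusterSetRNGStream(cl, iseed = 1)
run1 <- parSapply(cl, 1:4, function(i) runif(1))

# Reset the streams to the same starting point and repeat.
clusterSetRNGStream(cl, iseed = 1)
run2 <- parSapply(cl, 1:4, function(i) runif(1))

stopCluster(cl)

print(identical(run1, run2))
```

parSapply() splits its input statically across the workers, which is why the same tasks reach the same streams in both runs. Backends that schedule tasks dynamically may not give that guarantee.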

Could someone please try this on Windows:

library(challengeR)
library(doParallel)
library(testthat)

data <- read.csv(system.file("extdata", "data_matrix.csv", package="challengeR", mustWork=TRUE))

challenge <- as.challenge(data, by="task", algorithm="alg_name", case="case", value="value", smallBetter=FALSE)

ranking <- challenge%>%rankThenAggregate(FUN=mean, ties.method="min")

# Socket cluster, since forking is not available on Windows
cl <- makePSOCKcluster(2)
registerDoParallel(cl)

# Reset the workers' RNG streams to the same state before each run
parallel::clusterSetRNGStream(cl, iseed=1)
rankingBootstrapped1 <- ranking%>%bootstrap(nboot=10, parallel=TRUE, progress="none")
parallel::clusterSetRNGStream(cl, iseed=1)
rankingBootstrapped2 <- ranking%>%bootstrap(nboot=10, parallel=TRUE, progress="none")

# The cluster was created explicitly, so it is stopped explicitly
stopCluster(cl)

expect_equal(rankingBootstrapped1, rankingBootstrapped2)

Oh I HATE it!
Could you please try (first installing package "doRNG" https://cran.r-project.org/web/packages/doRNG/index.html ):

library(challengeR)

data <- read.csv(system.file("extdata", "data_matrix.csv", package="challengeR", mustWork=TRUE))

challenge <- as.challenge(data, by="task", algorithm="alg_name", case="case", value="value", smallBetter=FALSE)

ranking <- challenge%>%rankThenAggregate(FUN=mean, ties.method="min")

library(doParallel)
library(doRNG)
numCores <- detectCores(logical=FALSE)
registerDoParallel(cores=numCores)

registerDoRNG(1)
rankingBootstrapped1 <- ranking%>%bootstrap(nboot=10, parallel=TRUE, progress="none")

registerDoRNG(1)
rankingBootstrapped2 <- ranking%>%bootstrap(nboot=10, parallel=TRUE, progress="none")

stopImplicitCluster()

testthat::expect_equal(rankingBootstrapped1, rankingBootstrapped2)

This should work but requires an additional package... I'll try to come up with something else in a couple of minutes.

Using doRNG might be the best option; it should work on any OS.

Great @wiesenfa! The test with doRNG passed on Windows and Ubuntu!

Or we just forbid parallelization on Windows... Parallelization in R on Windows is such a series of workarounds...

I introduced the doRNG package to ensure reproducibility on Windows.