I introduced the doRNG package to ensure reproducibility on Windows.
Aug 11 2023
Aug 10 2023
Aug 9 2023
Mar 31 2023
Or we just forbid parallelization on Windows... Parallelizing R on Windows is such a series of workarounds...
Great @wiesenfa! The test with doRNG passed on Windows and Ubuntu!
Using doRNG might be the best option; it should work on any OS.
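A minimal sketch of the doRNG approach mentioned here, assuming the foreach, doParallel and doRNG packages are installed. It uses a socket (PSOCK) cluster, which does not rely on forking and is therefore also available on Windows. The toy bootstrap of a mean, the helper name `boot_means` and the seed value are illustrative assumptions, not challengeR's actual bootstrapping code.

```r
library(foreach)
library(doParallel)
library(doRNG)

# Socket cluster: does not use forking, so it also works on Windows.
cl <- parallel::makeCluster(2)
registerDoParallel(cl)

# Hypothetical helper: bootstrap the mean of x ten times in parallel.
# x is passed as an argument so that foreach exports it to the workers.
boot_means <- function(x, seed) {
  set.seed(seed)  # %dorng% derives reproducible per-iteration RNG streams from this seed
  foreach(i = 1:10, .combine = c) %dorng% {
    mean(sample(x, replace = TRUE))
  }
}

toy_data <- c(2.1, 3.4, 1.9, 5.0, 4.2)  # made-up values
run1 <- boot_means(toy_data, seed = 123)
run2 <- boot_means(toy_data, seed = 123)

parallel::stopCluster(cl)

# With the same seed, both runs should give the same bootstrap means.
print(isTRUE(all.equal(as.numeric(run1), as.numeric(run2))))
```

With a plain `%dopar%` the results could differ between runs, because `set.seed()` in the main session does not control the random number generators of the workers.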
Oh I HATE it!
Could you please try (first installing package "doRNG" https://cran.r-project.org/web/packages/doRNG/index.html ):
Mar 10 2023
Could someone please try on Windows
Oh, I hate it so much. I know the problem: only Windows is affected. Parallelization via forking does not work there; I keep forgetting this. I'll look for a solution on Windows.
I tested it with R 4.2.0 on a Windows system and got the same error. I stopped the test and looked at rankingBootstrapped1 and rankingBootstrapped2. Here are the screenshots:
Feb 23 2023
I implemented Manuel's suggestions in branch hotfix/T29361-EnsureReproducibilityWithParallelBootstrapping and added corresponding unit tests to test-bootstrap.R.
Oct 13 2022
Jun 17 2022
Warning messages when there are missing values in the data were reviewed as below:
Jun 14 2022
Hey everyone,
Jun 9 2022
I like the results when the scales library is used! However, when we find a way to bring back the confidence intervals, also @wiesenfa's latest solution can be used.
Jun 7 2022
Hey everyone,
I added tests in the current future branch to check the class of the “algorithm” column in the challenge object.
Jun 3 2022
First, I tried the fix with R 3.6 and can confirm that it does not break the functionality there.
May 30 2022
I have added object[[algorithm]] <- as.factor(object[[algorithm]]) to challengeR.R as you suggested. Now everything works without any problems. There is no need to specify stringsAsFactors anymore when reading the CSV.
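A small base-R illustration of why this coercion helps. Since R 4.0.0, data.frame() and read.csv() default to stringsAsFactors = FALSE, so a text column arrives as character unless it is converted. The data frame and its column names below are made up for the example.

```r
# Toy challenge data; the column names are illustrative assumptions.
object <- data.frame(
  alg_name = c("A2", "A1", "A2", "A3"),
  value = c(0.8, 0.6, 0.7, 0.9),
  stringsAsFactors = FALSE  # the default since R 4.0.0
)
algorithm <- "alg_name"

print(class(object[[algorithm]]))   # "character"

# Coerce once, so that downstream code can rely on factor levels
# regardless of the R version or of how the CSV was read.
object[[algorithm]] <- as.factor(object[[algorithm]])

print(class(object[[algorithm]]))   # "factor"
print(levels(object[[algorithm]]))  # "A1" "A2" "A3"
```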
May 23 2022
Thank you so much @aekavur ! It helps a lot to understand the reason finally!
May 22 2022
Hi again :)
May 16 2022
If the output is NULL, object[[by]] is not a factor, i.e. class(object[[by]]) is "character". In this case you need to use unique() and probably your solution
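The check described here can be tried out in base R. This sketch assumes that the NULL output refers to levels(object[[by]]); the helper name `group_names` and the task names are hypothetical.

```r
# Hypothetical helper: return the distinct group names of a column,
# whether it is stored as a factor or as a character vector.
group_names <- function(x) {
  if (is.factor(x)) levels(x) else unique(x)
}

by_chr <- c("task2", "task1", "task2")  # made-up task names

print(is.null(levels(by_chr)))      # TRUE: a character vector has no levels
print(group_names(by_chr))          # falls back to unique(), in order of appearance
print(group_names(factor(by_chr)))  # uses the (sorted) factor levels
```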
Hi again,
May 13 2022
Thanks Emre!
That's a weird change. I didn't find any mention of it in the R changelog.
probably instead of
algorithms=factor(unique(object[[by]]))
it will be preferred
Congrats on tracking this down!
Finally, I could find the source of the bug. 😊 It is caused by a change in the output type of the unique() function in base R between R 3 and R 4.
Feb 28 2022
I am sharing my current test code with artificial data. Since there can be 4-5 blob plots in the report (depending on the data and the number of tasks), I need to prepare new test code for the blob plots only. Until then, you may use the code I am sharing.
Thanks Emre. That's problematic: the confidence intervals are missing. Could you share a code file for testing with artificial data (ideally not with the report as output but the plot itself)? Then I will try to look into it. Or is this difficult for you?
I have tried this approach. I just needed to remove the minor_breaks=NULL, line since there is no such option in R/scale-discrete-.r
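The factor-based idea discussed in this part of the thread can be sketched as follows. The toy ranking and its column names are assumptions, and the plotting part only runs if ggplot2 is installed; it is not the code from ./R/Stability.R.

```r
# Toy ranking of 7 algorithms; names and values are made up.
ranks <- data.frame(
  algorithm = paste0("A", 1:7),
  rank = c(3, 1, 2, 5, 4, 7, 6)
)
n_algorithms <- nrow(ranks)

# Treat rank as a factor with one level per possible rank, so that the
# axis always runs from 1 to the number of algorithms.
ranks$rank_f <- factor(ranks$rank, levels = seq_len(n_algorithms))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(ranks, aes(x = algorithm, y = rank_f)) +
    geom_point() +
    scale_y_discrete(drop = FALSE)  # keep rank levels that have no data
}
```

A discrete scale has no minor breaks, which fits the remark about removing minor_breaks=NULL. Layers that need a numeric rank would have to be handled separately; the thread reports missing confidence intervals with this approach.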
Feb 24 2022
I think the solution is to treat rank not as continuous but as a factor (essentially a string).
That means first following
Feb 21 2022
Thanks Emre! This sounds like a lot of effort. Please give me some time to have a look at it.
I have tried many configurations just to force ggplot2 to start y-axis labels from "1" when choosing automatic scaling. However, it was not possible :/
Feb 14 2022
I guess overall it's a matter of taste.
The fully automatic one has several problems: in the case of the 30 algorithms, the scale starts at 0, which is not sensible. I'm not sure what happens with something like 27 or 17 algorithms (a number that isn't divisible by 5). In the case of the 7 algorithms it starts at 2, which I find a bit weird; I would expect a scale starting at 1. Thus, I would at least include the limits=c(1,max(...)) argument, which however, as said before, may lead to sequences like 1,7,13,... but maybe this is not so much of a problem.
Let's try the automatic config of ggplot :)
If I remember correctly, this didn't work layout-wise for a large number of algorithms. The numbers will either overlap or need to get very small, or the size of the figure will need to be increased.
Try to test with something like 20 algorithms. How does the report look then?
What's the problem with 1,5,10,15,18? The scale isn't affected, so for me it wouldn't matter that the intervals are not the same. In principle you could also omit the 18, i.e. only 1,5,10,15. Instead of all integers, I would rather use the automatic choice.
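One way to get breaks such as 1,5,10,15 is a small helper like the one below. The function name `rank_breaks` and the step of 5 are assumptions for illustration; the plotting part only runs if ggplot2 is installed and is not the actual challengeR code.

```r
# Hypothetical helper: a break at rank 1, then at every `step`-th rank.
rank_breaks <- function(n_algorithms, step = 5) {
  # With fewer algorithms than `step`, simply label every rank.
  if (n_algorithms < step) return(seq_len(n_algorithms))
  c(1, seq(step, n_algorithms, by = step))
}

print(rank_breaks(18))  # 1 5 10 15
print(rank_breaks(7))   # 1 5
print(rank_breaks(30))  # 1 5 10 15 20 25 30

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  n <- 18
  toy <- data.frame(algorithm = paste0("A", seq_len(n)), rank = seq_len(n))
  p <- ggplot(toy, aes(x = algorithm, y = rank)) +
    geom_point() +
    # The scale starts at 1 and ends at the number of algorithms.
    scale_y_continuous(breaks = rank_breaks(n), limits = c(1, n))
}
```

The last rank is left unlabeled here (18 in the first call), which matches the suggestion that the final break can be omitted.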
I agree with you. On the other hand, placing breaks at a fixed integer interval can be tricky. For example, let's assume that we have decided to define breaks at every 5th element. The y-axis will show 1,5,10,15,18 for a challenge with 18 algorithms. The last portion of the sequence will have a different interval. Therefore, I propose including all integer breaks for the [1, #algorithms] range. I am putting some examples here:
Feb 11 2022
Not sure whether this is a good idea. Imagine a challenge with 18 algorithms. There will be only a 1 and an 18 and nothing in between, which may make it difficult to read. What do you think?
I have tried the suggested code, but it did not fix the problem. Besides, it caused additional issues. :)
Could you try to replace "breaks" by "labels" in
I have tested this with the provided data. The scaling of the y-axis seems to be correct now. But only the first rank is labeled on the y-axis. Can the other ranks be labeled as well?
Feb 8 2022
The problem is almost fixed by passing the na.treat parameter to both as.challenge and the ranking methods (except rankThenAggregate). Now we can generate reports for all ranking methods.
Feb 7 2022
The scale_y_continuous calls inside the ./R/Stability.R file were modified. The problem seems to be solved. You can test it on the feature/T28966-YaxisOfBlobPlotsAlwaysScaledTo5 branch using the attached file. (You can run it from the root folder of the challengeR code.)
I guess na.treat is only needed for the line plot for comparing to other ranking methods?
In this case, a message could be thrown when compiling the report saying something like "line plot comparing ranking methods omitted since na.treat is not specified. Specify na.treat in as.challenge() if inclusion of line plot is desired" and allow compilation of the report (excluding line plot).
(Note that you can define na.treat both in as.challenge() as well as in the ranking functions).
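The proposed behaviour could look roughly like the sketch below. The function `line_plot_enabled` is hypothetical and not part of challengeR; it only illustrates emitting a message and skipping the line plot when na.treat is missing.

```r
# Hypothetical check, not challengeR's actual code: decide whether the line
# plot comparing ranking methods can be included in the report.
line_plot_enabled <- function(na.treat = NULL) {
  if (is.null(na.treat)) {
    message(
      "Line plot comparing ranking methods omitted since na.treat is not ",
      "specified. Specify na.treat in as.challenge() if inclusion of line ",
      "plot is desired."
    )
    return(FALSE)
  }
  TRUE
}

line_plot_enabled()              # emits the message and returns FALSE
line_plot_enabled(na.treat = 0)  # returns TRUE, the plot can be included
```

Using message() rather than stop() lets the rest of the report compile, which is the point of the suggestion.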
Jan 28 2022
Thank you for investigating this! In challengeR it is covered in the way that a message is emitted saying "na.treat obligatory if report is intended to be compiled". In order to solve the mentioned issue 2, a strategy for the preferred way to handle it in VISSART should be defined. Should the user be guided to specify the NaN handling strategy? Should the user be able to generate a report but without the plots that require numeric values?
Current status of the issue:
Nov 29 2021
Nov 16 2021
Oct 28 2021
Oct 15 2021
Sep 29 2021
Successfully checked the proposed fix on a Windows system for downward compatibility (R 3.6.3, ggplot2 3.3.0).
Sep 28 2021
Jul 27 2021
May 7 2021
May 6 2021
May 4 2021
Apr 26 2021
@eisenman the change in the develop branch has not been uploaded to GitHub. Is this not automatically synchronized? So the user who reported the bug still has the same problem. It would be good to merge into master as well.
Apr 23 2021
There is now a test case in test-report.R.
Do you have a minimum example to reproduce this? Would be great to have that in the test checklists as well.
Apr 22 2021
very simple fix in rankingHeatmap.challenge
@eisenman can this be merged into master branch?
Apr 19 2021
Mar 3 2021
In this case we solved it by upgrading BiocManager. Probably the previous version did not know about the later version of "graph".
graph had only been used for networks, I guess version 1.62 is sufficient