diff --git a/README.md b/README.md
index efd9552..31cf9a6 100644
--- a/README.md
+++ b/README.md
@@ -1,108 +1,108 @@
 # Probabilistic U-Net
-Re-implementation of the model described in `A Probabilistic U-Net for Segmentation of Ambiguous Images' ([NeurIPS 2018 poster](https://arxiv.org/abs/1806.05034)).
+Re-implementation of the model described in `A Probabilistic U-Net for Segmentation of Ambiguous Images' ([paper @ NeurIPS 2018](https://arxiv.org/abs/1806.05034)).

-This was also a spotlight presentation at NeurIPS and short video on the paper of similar content can be found [here](https://youtu.be/-cfFxQWfFrA) (4min).
+This was also a spotlight presentation at NeurIPS and a short video on the paper of similar content can be found [here](https://youtu.be/-cfFxQWfFrA) (4min).

 The architecture of the Probabilistic U-Net is depicted below: subfigure a) shows sampling and b) the training setup:
 ![](assets/architecture.png)

 Below see samples conditioned on held-out validation set images from the (stochastic) CityScapes data set:
 ![](assets/10_image_16_sample.gif)

 ## Setup package in virtual environment
 ```
 git clone https://github.com/SimonKohl/probabilistic_unet.git .
 cd prob_unet/
 virtualenv -p python3 venv
 source venv/bin/activate
 pip3 install -e .
 ```

 ## Install batch-generators for data augmentation
 ```
 cd ..
 git clone https://github.com/MIC-DKFZ/batchgenerators
 cd batchgenerators
 pip3 install nilearn scikit-image nibabel
 pip3 install -e .
 cd prob_unet
 ```

 ## Download & preprocess the Cityscapes dataset

 1) Create a login account on the Cityscapes website: https://www.cityscapes-dataset.com/
 2) Once you've logged in, download the train, val and test annotations and images:
     - Annotations: [gtFine_trainvaltest.zip](https://www.cityscapes-dataset.com/file-handling/?packageID=1) (241MB)
     - Images: [leftImg8bit_trainvaltest.zip](https://www.cityscapes-dataset.com/file-handling/?packageID=3) (11GB)
 3) unzip the data (unzip _trainvaltest.zip) and adjust `raw_data_dir` (full path to unzipped files) and `out_dir` (full path to desired output directory) in `preprocessing_config.py`
 4) bilinearly rescale the data to a resolution of 256 x 512 and save as numpy arrays by running
 ```
 cd cityscapes
 python3 preprocessing.py
 cd ..
 ```

 ## Training

 [skip to evaluation in case you only want to use the pretrained model.]
 modify `data_dir` and `exp_dir` in `scripts/prob_unet_config.py` then:
 ```
 cd training
 python3 train_prob_unet.py --config prob_unet_config.py
 ```

 ## Evaluation

 Load your own trained model or use a pretrained model. A set of pretrained weights can be downloaded from [zenodo.org](https://zenodo.org/record/1419051#.W5utoOEzYUE) (187MB). After downloading, unpack the file via `tar -xvzf pretrained_weights.tar.gz`, e.g. in `/model`. In either case (using your own or the pretrained model), modify the `data_dir` and `exp_dir` in `evaluation/cityscapes_eval_config.py` to match your paths.

 Then first write samples (defaults to 16 segmentation samples for each of the 500 validation images):
 ```
 cd ../evaluation
 python3 eval_cityscapes.py --write_samples
 ```
 followed by their evaluation (which is multi-threaded and thus reasonably fast):
 ```
 python3 eval_cityscapes.py --eval_samples
 ```
 The evaluation produces a dictionary holding the results.
 These can be visualized by launching an IPython notebook:
 ```
 jupyter notebook evaluation_plots.ipynb
 ```
 The following results are obtained from the pretrained model using the above notebook:
 ![](assets/validation_results.png)

 ## Tests
 The evaluation metrics are under test-coverage. Run the tests as follows:
 ```
 cd ../tests/evaluation
 python3 -m pytest eval_tests.py
 ```

 ## Deviations from original work
 The code found in this repository was not used in the original paper and slight modifications apply:

 - training on a single GPU (Titan Xp) instead of distributed training, which is not supported in this implementation
 - average-pooling rather than bilinear interpolation is used for down-sampling operations in the model
 - the number of conv kernels is kept constant after the 3rd scale as opposed to strictly doubling it after each scale (for reduction of memory footprint)
 - HeNormal weight initialization worked better than an orthogonal weight initialization

 ## How to cite this code
 Please cite the original publication:
 ```
 @article{kohl2018probabilistic,
   title={A Probabilistic U-Net for Segmentation of Ambiguous Images},
   author={Kohl, Simon AA and Romera-Paredes, Bernardino and Meyer, Clemens and De Fauw, Jeffrey and Ledsam, Joseph R and Maier-Hein, Klaus H and Eslami, SM and Rezende, Danilo Jimenez and Ronneberger, Olaf},
   journal={arXiv preprint arXiv:1806.05034},
   year={2018}
 }
 ```

 ## License
 The code is published under the [Apache License Version 2.0](LICENSE).
\ No newline at end of file
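
Note on the "Deviations from original work" list in the README above: the second bullet replaces bilinear interpolation with average-pooling for down-sampling. The following is a minimal sketch of what that operation computes, not the repository's implementation. It assumes a 2x2 window with stride 2 (the README does not state the window size) and uses plain Python lists rather than framework tensors. The name `average_pool_2x2` is hypothetical and does not appear in the code base.

```python
def average_pool_2x2(image):
    """Down-sample a 2D grid by averaging non-overlapping 2x2 blocks.

    Assumption for illustration: window size 2 and stride 2, so the
    output has half the height and half the width of the input.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")

    pooled = []
    for row in range(0, height, 2):
        pooled_row = []
        for col in range(0, width, 2):
            # Sum the four values of the current 2x2 block.
            block_sum = (image[row][col] + image[row][col + 1]
                         + image[row + 1][col] + image[row + 1][col + 1])
            pooled_row.append(block_sum / 4.0)
        pooled.append(pooled_row)
    return pooled


if __name__ == "__main__":
    feature_map = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
    ]
    print(average_pool_2x2(feature_map))
```

Each output value depends only on its own 2x2 block, so, unlike bilinear interpolation, no interpolation weights or sampling coordinates are involved.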