Page MenuHomePhabricator

Boolean Operations in Segmentation utilities are slow
Open, NormalPublic

Assigned To
None
Authored By
isensee
Aug 24 2021, 10:28 AM
Referenced Files
F2417214: tmp3.tif
Aug 24 2021, 10:28 AM
F2417213: tmp2.tif
Aug 24 2021, 10:28 AM

Description

open both files in MITK, open segmentation utilities (not multilabel!)

Then run difference, union or intersection.

Due to the large image size all of these take a long time:
Difference: ~14s
Intersection: ~13.5s
Union: ~12s
(times measured with a stop watch, so +- 0.5s)

I understand that there is some overhead involved with writing back the result into some image and then displaying it, but boolean operations are rather basic stuff and should run WAY faster.

Furthermore MITK uses a an unreasonable amount of RAM to compute the results (compared to a python implementation). That is probably because the target image is unsigned short, even though both source images were unsigned char. Why?

Here is a reference implementation in python:

import numpy as np
import tifffile
from time import time

a = tifffile.imread('tmp2.tif')
b = tifffile.imread('tmp3.tif')

# difference
st = time()
c = np.copy(a)
c[b > 0] = 0
end = time()
print('difference took', end - st, 's')

# union
st = time()
c = ((a > 0) | (b > 0)).astype(np.uint8)
end = time()
print('union took', end - st, 's')

# intersection
st = time()
c = ((a > 0) & (b > 0)).astype(np.uint8)
end = time()
print('intersection took', end - st, 's')

difference took 0.4352085590362549 s
union took 0.5148181915283203 s
intersection took 0.5096790790557861 s

Even though MITK uses multiprocessing the single threaded python/numpy implementation is much faster.

Best,

Fabian

Event Timeline

isensee renamed this task from Boolean Operations in Segmentation utilities are excruciatingly slow to Boolean Operations in Segmentation utilities are slow.Aug 24 2021, 10:31 AM
kislinsk triaged this task as Normal priority.Sep 15 2021, 4:54 AM
kislinsk edited projects, added MITK (v2021.10); removed MITK.
kislinsk added a subscriber: kislinsk.

This could be connected to the undo/redo implementation of MITK and the creation of difference images for the undo stack. I do not know your version of MITK but the use of LZ4 compression for difference images in the latest versions could already be the or part of the solution. I agree that the operations should be much faster but they might be already as fast as the python implementation so whoever is working on this task should clock the different parts of the operation first to see what is actually taking so long and as said, if it is still the case in the latest version of MITK.

Unfortunately still slow. Just checked with 2021-10-01 shapshot on Ubuntu 18.04 (same system as tests above).
13.6s for intersection
14s for difference
didn't bother clocking union

Hi everyone,
I did some basic profiling on this issue to see where the issue might be and how the python vs c++ compute comparison looks like.
TL;DR: MITK computes aren't slow at all. Problem is in conversion of results to LabelSetImage via InitializeByLabeledImage method.

Baselining:
I think the comparison aren't exactly apples to apples. The tmp2.tif is uint16 data type (0,255 pixel values) and, tmp3.tif is uinit8 (0,1 pixel values). So I converted tmp2.tif into uint8 with 0,1 pixel values & created new tif file for comparison. Also, python-numpy compute snippet should be compared to itk filter operations on the data. ie. the profiling should happen in mitk::BooleanOperation::GetDifference, mitk::BooleanOperation::GetUnion, mitk::BooleanOperation::GetIntersection methods.

Machine:
Windows 10, Intel i7-1280P, 32GB RAM; Laptop, connected to power, set to "High Performance" mode.

Python run times:

difference took 1.0259475708007812 s
union took 2.186373472213745 s
intersection took 2.436326742172241 s

MITK runtimes (on release build)=:

Overall wall clock runtimes are on an average 12-15s. That's is including compute, new node creation, rendering etc. But excluding the compute, rest are common denominators.
So when it comes compute performance, for an apples to apples comparison, I measured just the filter processing and the overall time of execution of eg. mitk::BooleanOperation::GetDifference method. The difference being just the final packing of the compute results into an LabelSetImage via InitializeByLabeledImage call. The profiling was done using chrono library for line to line time measurement.

22.40 core.mod.seg.booleanoperation: in GetDifference compute duration: 1249
27.81 core.mod.seg.booleanoperation: in GetDifference total duration: 6657
55.18 core.mod.seg.booleanoperation: in GetUnion compute duration: 1000
60.12 core.mod.seg.booleanoperation: in GetUnion total duration: 5936
66.95 core.mod.seg.booleanoperation: in GetIntersection compute duration: 1033
79.45 core.mod.seg.booleanoperation: in GetIntersection total duration: 13529

As you can see above, the c++ compute times are very much in par with python computes times, if not better!
But still, the overall execution time of GetDifference or GetIntersection are relatively high, thanks to InitializeByLabeledImage method overhead.

Conclusion:
InitializeByLabeledImage is widely used in MITK segmentation areas. It seems to be pretty slow and needs revisiting if possible.