In T28491: [Segmentation] Slice interpolation 'confirm for all slices' sometimes crashes, the duration of 2D interpolations of large segmentations was reduced by an order of magnitude. However, there is potential for another order of magnitude, since the do/undo mechanism involved in applying the interpolation (or any other segmentation tool result) is very slow.
One part of the issue (about 2/3 of the time spent) is the zlib compression of difference images. This could be greatly reduced by switching to a modern compression algorithm that is optimized for compression speed, since the data in question is compression-friendly anyway.
Since the bottleneck is memory access in this case, using such a modern compression algorithm could be even faster than doing a plain block copy of uncompressed data.
I suggest LZ4 for this purpose.
The other part of the issue (about 1/3 of the time currently spent) is applying the difference image to a segmentation. This could potentially be accelerated with multi-threading.
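A minimal sketch of the multi-threading idea, assuming the difference image is a per-voxel delta that is added to the segmentation (with wrap-around in the label type). That representation, the `uint16_t` label type, and the helper name `ApplyDiffParallel` are assumptions for illustration; the actual apply step may work differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical helper: adds `diff` to `seg` voxel by voxel, using one
// contiguous chunk of the buffer per worker thread.
void ApplyDiffParallel(std::vector<std::uint16_t>& seg,
                       const std::vector<std::uint16_t>& diff,
                       unsigned numThreads = std::thread::hardware_concurrency())
{
  if (seg.size() != diff.size())
    throw std::invalid_argument("segmentation and difference image differ in size");
  if (numThreads == 0)
    numThreads = 1; // hardware_concurrency() may return 0 if unknown

  const std::size_t n = seg.size();
  const std::size_t chunk = (n + numThreads - 1) / numThreads;

  std::vector<std::thread> workers;
  for (unsigned t = 0; t < numThreads; ++t)
  {
    const std::size_t begin = std::min(n, t * chunk);
    const std::size_t end = std::min(n, begin + chunk);
    if (begin == end)
      break; // no voxels left for further threads

    // Chunks do not overlap, so no synchronization is needed while writing.
    workers.emplace_back([&seg, &diff, begin, end] {
      for (std::size_t i = begin; i < end; ++i)
        seg[i] = static_cast<std::uint16_t>(seg[i] + diff[i]);
    });
  }

  for (auto& worker : workers)
    worker.join();
}
```

Calling `ApplyDiffParallel(seg, diff)` uses as many threads as the hardware reports. Since this loop is likely limited by memory bandwidth rather than arithmetic, the speedup may level off well below the core count and would need to be measured.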