Speed tests

The scripts used for these tests can be found in \tests\speed\

The following computers have been used:

Note

The tests here were run with fewer rays/samples than real calculations so that they finish reasonably quickly. Longer calculations would show an even bigger difference between the slowest and the fastest cases, as the overheads (job distribution, collection of histograms and plotting) would become relatively less important.
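The effect of overheads described in the note can be illustrated with a minimal sketch. It assumes a simple model that is not taken from the tests themselves: a fixed overhead that does not shrink with the number of workers, plus a work part that scales ideally. The function names and all numbers are hypothetical.

```python
def wall_time(work_time, overhead_time, n_workers):
    """Wall time under the assumed model: fixed overhead + ideally divided work."""
    return overhead_time + work_time / n_workers


def slowest_to_fastest(work_time, overhead_time, n_workers):
    """Ratio of the single-worker time to the n-worker time."""
    slow = wall_time(work_time, overhead_time, 1)
    fast = wall_time(work_time, overhead_time, n_workers)
    return slow / fast


# Hypothetical numbers: 20 s of overhead, 4 workers,
# a short test (100 s of work) and a long calculation (1000 s of work).
short_ratio = slowest_to_fastest(100.0, 20.0, 4)
long_ratio = slowest_to_fastest(1000.0, 20.0, 4)
print(f"short test: {short_ratio:.2f}x, long calculation: {long_ratio:.2f}x")
```

Under this model the ratio approaches the number of workers as the work part grows, which is why shortened tests understate the difference between the slowest and the fastest cases.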

The tables below show execution times in seconds. Some cells have two values: the first for Python 2 and the second for Python 3.

Multithreading and multiprocessing in ray tracing

Script: \tests\speed\1_SourceZCrystalThetaAlpha_speed.py.
The test is based on the example \examples\withRaycing\07_AnalyzerBent2D\01BD_SourceZCrystalThetaAlpha.py.

This script calculates diffraction from a 2D-bent diced crystal analyzer for three types of geometric source.
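For readers who want to time thread and process pools on their own machine, here is a generic sketch using only the Python standard library. It is not the test script and does not use the ray tracing code: `trace_chunk` is a hypothetical stand-in for tracing one batch of rays, and `timed_run` is an illustrative helper.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def trace_chunk(n_rays):
    """Hypothetical stand-in for tracing one batch of rays (pure-Python CPU work)."""
    acc = 0.0
    for i in range(n_rays):
        acc += (i % 7) * 0.5
    return acc


def timed_run(executor_cls, n_workers, n_chunks=8, n_rays=20000):
    """Run n_chunks batches on a pool of n_workers; return (seconds, summed result)."""
    start = time.perf_counter()
    with executor_cls(max_workers=n_workers) as pool:
        results = list(pool.map(trace_chunk, [n_rays] * n_chunks))
    return time.perf_counter() - start, sum(results)


elapsed, total = timed_run(ThreadPoolExecutor, 2)
print(f"2 threads: {elapsed:.3f} s")
```

Passing `concurrent.futures.ProcessPoolExecutor` as `executor_cls` times a process pool in the same way; on Windows that call has to sit under an `if __name__ == "__main__":` guard. Because the stand-in is a pure-Python loop, threads in standard CPython are not expected to speed it up, so the numbers it gives say nothing about the tables in this section.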

                                              multithreading                multiprocessing
system                         1              2              4              2              4
[1]  Windows 10 64 bit         510.2, 539.2   306.0, 318.5   220.1, 220.6   458.8, 573.4   338.8, 441.2
     Ubuntu 16.04 64 bit       436.1, 446.7   257.2, 263.3   183.3, 188.6   246.5, 250.0   157.1, 158.7
[2]  Windows 10 64 bit         359.9, 546.9   290.8, 292.0   218.7, 219.7   362.2, 436.4   290.8, 363.6
     Ubuntu 16.10 64 bit       293.3, 293.1   172.3, 173.6   169.5, 168.3   173.3, 172.4   156.1, 154.6
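The timings above can be turned into speedup and parallel efficiency figures. The sketch below uses the first value of each cell and assumes that column "1" is the single-worker baseline of its row; the names `TIMES`, `speedup`, `efficiency` and `report` are illustrative only.

```python
# First value of each cell in the ray-tracing table, in seconds.
# Assumption: column "1" is the single-worker baseline for its row.
TIMES = {
    "[1] Windows 10": {"serial": 510.2, "threads": {2: 306.0, 4: 220.1},
                       "processes": {2: 458.8, 4: 338.8}},
    "[1] Ubuntu 16.04": {"serial": 436.1, "threads": {2: 257.2, 4: 183.3},
                         "processes": {2: 246.5, 4: 157.1}},
    "[2] Windows 10": {"serial": 359.9, "threads": {2: 290.8, 4: 218.7},
                       "processes": {2: 362.2, 4: 290.8}},
    "[2] Ubuntu 16.10": {"serial": 293.3, "threads": {2: 172.3, 4: 169.5},
                         "processes": {2: 173.3, 4: 156.1}},
}


def speedup(t_serial, t_parallel):
    """How many times faster the parallel run is than the baseline."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_workers):
    """Speedup per worker; 1.0 would mean ideal scaling."""
    return speedup(t_serial, t_parallel) / n_workers


def report(times=TIMES):
    """Return (system, mode, workers, speedup, efficiency) for every parallel cell."""
    rows = []
    for system, row in times.items():
        for mode in ("threads", "processes"):
            for n, t in sorted(row[mode].items()):
                rows.append((system, mode, n,
                             speedup(row["serial"], t),
                             efficiency(row["serial"], t, n)))
    return rows


for system, mode, n, s, e in report():
    print(f"{system:18s} {mode:9s} x{n}: speedup {s:.2f}, efficiency {e:.2f}")
```

One detail that the ratios make visible: on system [2] in Windows, two processes (362.2 s) were slightly slower than the single-worker run (359.9 s).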

OpenCL performance with Undulator source

Script: \tests\speed\2_synchrotronSources_speed.py.
The test is based on the example \examples\withRaycing\01_SynchrotronSources\synchrotronSources.py.

This script calculates characteristics of an undulator source at energies around one harmonic.

system                         no OpenCL     OpenCL on CPU   OpenCL on GPU
[1]  Windows 10 64 bit         1471, 1385    36.0, 34.1      25.7, 23.9
     Ubuntu 16.04 64 bit       950, 950      34.6, 35.4      20.6, 21.0
[2]  Windows 10 64 bit         1801, 1909    61.0, 60.3      126, 123
     Ubuntu 16.10 64 bit       1245, 1255    57.6, 60.2      122, 127

[9]  local: 30.0
     ZMQ 1Gb: 182.9

OpenCL performance with wave propagation

Script: \tests\speed\3_Softi_CXIw2D_speed.py.
The test is based on the example \examples\withRaycing\14_SoftiMAX\Softi_CXIw2D.py.

This script calculates several consecutive wave propagation integrals from the source down to the focus at the sample. Here, each wave is represented by 2·10⁵ samples, so each integral considers 4·10¹⁰ scattering paths. Such calculations are impossible in numpy and have to be carried out with the help of OpenCL. Even with OpenCL, these calculations are not feasible on low-end graphics cards and are therefore not exemplified here on a laptop. Note also that for a real calculation a larger number of samples, > 10⁶ (i.e. > 10¹² scattering paths), should be chosen for better convergence.
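A back-of-the-envelope estimate shows the scale of these integrals. The sketch below counts scattering paths as the number of samples squared and, as an assumption made only for illustration, estimates the memory a dense array would need if one complex128 value (16 bytes) were stored per path. The actual implementation may organize the calculation differently.

```python
def n_paths(n_samples):
    """Each sample on one surface is connected to each sample on the next one."""
    return n_samples ** 2


def dense_kernel_bytes(n_samples, bytes_per_element=16):
    """Memory for one value per path; 16 bytes assumes complex128 (an assumption)."""
    return n_paths(n_samples) * bytes_per_element


for n in (2 * 10**5, 10**6):
    paths = n_paths(n)
    gigabytes = dense_kernel_bytes(n) / 1e9
    print(f"{n:.0e} samples: {paths:.0e} paths, {gigabytes:,.0f} GB as a dense array")
```

With 2·10⁵ samples this gives 4·10¹⁰ paths and 640 GB under the stated assumption; with 10⁶ samples it gives 10¹² paths and 16 000 GB.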

system                         OpenCL on CPU   OpenCL on GPU
[1]  Windows 10 64 bit         637, 635        76.8, 76.4
     Ubuntu 16.04 64 bit       602, 605        71.1, 71.0
[3]  1×K80                                     196
     4×K80                                     76.5
[4]  1×P100                                    53.0
     4×P100                                    25.8
[5]  Xeon E5-2650 v3           321
[6]  Xeon E5-2650 v4           251
[7]  Xeon Gold 6130            162
[8]  1×A100                                    17.5
     2×A100                                    11.5

[9]  local: 40.4
     ZMQ 1Gb: 48.4

Summary

  • Except for the case of computing with OpenCL on GPU, calculations in Linux are usually significantly faster than in Windows. The gap is largest with multithreading or multiprocessing: for example, with 4 processes on system [1] the ray tracing test took about 157 s in Ubuntu versus 339 to 441 s in Windows.

  • There is no significant difference in speed between Python 2 and Python 3, except for multiprocessing in Windows, where Python 2 performs better.

  • For geometric ray tracing a decent laptop can be a reasonable choice.