Page 42 - Summer 2019

```
Modeling Therapeutic Ultrasound
steps is reduced. In general, this provides a very practical way to test the accuracy of almost any numerical model. Reduce the size of the grid spacing (Δx) and the time step (Δt) or otherwise increase the mesh density, and check to see if the answer remains the same. If not, keep reducing Δx and Δt until the answer no longer changes. This procedure is called a convergence test and should always be carried out for every modeling scenario. In general, the output from a numerical model should not be trusted unless convergence has been demonstrated!

Finite-difference methods have been widely used for modeling in acoustics; however, these methods often require very large computational grids to avoid numerical dispersion. To reduce dispersion errors, higher order finite-difference schemes can be implemented that use more neighboring grid points to estimate the spatial and temporal gradients. Spectral methods take this idea to the limit and use all of the grid points simultaneously by fitting a finite sum of basis functions to the data. In acoustics, a common choice is to use trigonometric functions, where the fitting is performed by taking a fast Fourier transform (FFT). This is the idea behind the PSTD and k-space methods that calculate spatial gradients in the spatial-frequency domain. Although computationally more expensive than the FDTD method for a fixed grid size, these methods can significantly reduce dispersion errors and thus the number of points per wavelength required for accurate solutions (Tabei et al., 2002).

A remaining challenge for collocation methods computed on regular Cartesian grids is the introduction of medium staircasing. This arises because the material properties must be represented at discrete points in the model (think of the intersection of lines on a sheet of graph paper), and in many cases, the material boundaries are not aligned with the grid. This leads to stair-like edges between regions with different material properties that generate spurious acoustic reflections. For the PSTD method, this can be the dominant source of error (Robertson et al., 2017a). Although these errors will reduce with increasing grid density, in some cases, it can be challenging to perform a convergence test because the properties are only known at a fixed resolution (e.g., from a medical image), often of the same order as the acoustic wavelength.

Computer Code
Once a numerical method has been developed, this must be turned into computer code that can be used to perform simulations. Typically, the high-level goals are to (1) implement the numerical algorithm correctly, (2) maximize performance (e.g., reduce run time), and (3) minimize the computational resources needed (e.g., memory). Unsurprisingly, the development of efficient high-performance computer code is closely connected to a deep understanding of the underlying computer hardware. This is particularly relevant for models of therapeutic ultrasound where the grid sizes are often extremely large and complex calculations such as the FFT are performed (Jaros et al., 2015).

Computational hardware has undergone rapid changes since the first appearance of microprocessors in the late 1950s. Huge increases in performance have been enabled by continual improvements in semiconductor lithography leading to a doubling in the number of transistors on a computer chip approximately every 18 months. During the twentieth century, performance increases were also obtained through increases in transistor switching frequency. These days, however, performance increases are instead driven by increases in parallelization across all levels of processing along with the development of specialized compute units such as graphics processing units (GPUs). This means a modern supercomputing cluster can be highly heterogeneous, consisting of multiple interconnected computers, each potentially containing multiple central processing units (CPUs) and GPUs, where each CPU and GPU has multiple cores, each of which can execute multiple instructions simultaneously on multiple data points! Similarly, there is a hierarchy of local and remote memory with different storage capacities and access speeds. Although these details may not be familiar to many acousticians, they are nonetheless important. Effectively programming for such heterogeneous architectures is highly nontrivial and can have a large impact on the performance and tractability of running therapeutic ultrasound simulations (Jaros et al., 2015).

For heterogeneous computer environments, there are two fundamental requirements to consider: data locality and workload balance. Data locality is critical because there is a huge difference in the transfer speed (20 times slower) and latency (100 times slower) when accessing data stored on another interconnected computer compared with data stored in local memory (e.g., cache). For the large computational problems encountered in ultrasound, this means the data must be carefully decomposed into different levels of memory so that communication is minimized or overlapped with other useful calculations. Workload balance is critical because different parts of a heterogeneous system can have
40 | Acoustics Today | Summer 2019
```
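The convergence test described on this page can be sketched in a few lines. The example below is an illustration only, not code from the article: it assumes a toy model (a second-order finite-difference time-domain scheme for the 1D wave equation on a periodic unit domain), and the names `simulate` and `convergence_test`, the 0.1% tolerance, and the fixed CFL number of 0.5 are all hypothetical choices. Because the CFL number is fixed, halving Δx also halves Δt.

```python
import numpy as np


def simulate(n, t_end=0.875, c=1.0, cfl=0.5):
    """Toy model (assumed for illustration): 1D periodic wave equation solved
    with a second-order FDTD scheme. Returns the pressure at x = 0.25, t = t_end.
    n should be a multiple of 4 so that x = 0.25 lies on the grid."""
    dx = 1.0 / n
    dt = cfl * dx / c                     # dt shrinks together with dx
    steps = int(round(t_end / dt))
    x = np.arange(n) * dx
    p_prev = np.sin(2 * np.pi * x)        # initial pressure, zero initial velocity

    def laplacian(p):
        # Periodic second difference (not yet divided by dx**2)
        return np.roll(p, -1) - 2 * p + np.roll(p, 1)

    # First time step, using the zero initial velocity
    p = p_prev + 0.5 * cfl**2 * laplacian(p_prev)
    for _ in range(steps - 1):
        p, p_prev = 2 * p - p_prev + cfl**2 * laplacian(p), p
    return p[n // 4]


def convergence_test(model, n_start=16, rel_tol=1e-3, max_refinements=8):
    """Double the grid density until the model output stops changing."""
    n = n_start
    previous = model(n)
    for _ in range(max_refinements):
        n *= 2
        current = model(n)
        if abs(current - previous) <= rel_tol * abs(current):
            return n, current
        previous = current
    raise RuntimeError("no convergence within the allowed refinements")


if __name__ == "__main__":
    n, value = convergence_test(simulate)
    print(f"converged with {n} grid points, output = {value:.5f}")
```

For this toy problem the exact answer is known, cos(2π · 0.875), so the converged value can be checked directly. In a realistic scenario no exact answer is available, and the agreement between successive refinements is the only evidence of convergence.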
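The contrast drawn on this page between finite-difference and spectral gradients can be illustrated on a single sinusoid. The sketch below is an assumption-laden example rather than the PSTD or k-space method itself: it compares a second-order centered difference with a Fourier collocation gradient (FFT, multiply by ik, inverse FFT) on a periodic grid. The function name `gradient_errors` and its parameters are hypothetical.

```python
import numpy as np


def gradient_errors(points_per_wavelength, num_wavelengths=8):
    """Return the relative maximum error of (finite-difference, spectral)
    estimates of d/dx sin(kx) on a periodic unit domain."""
    n = points_per_wavelength * num_wavelengths
    dx = 1.0 / n
    x = np.arange(n) * dx
    k = 2 * np.pi * num_wavelengths
    p = np.sin(k * x)
    exact = k * np.cos(k * x)

    # Second-order centered difference: uses two neighboring points
    fd = (np.roll(p, -1) - np.roll(p, 1)) / (2 * dx)

    # Fourier collocation gradient: uses every grid point via the FFT
    kx = 2 * np.pi * np.fft.fftfreq(n, d=dx)
    spectral = np.real(np.fft.ifft(1j * kx * np.fft.fft(p)))

    scale = np.max(np.abs(exact))
    return (np.max(np.abs(fd - exact)) / scale,
            np.max(np.abs(spectral - exact)) / scale)


if __name__ == "__main__":
    for ppw in (4, 8, 16, 32):
        fd_err, sp_err = gradient_errors(ppw)
        print(f"{ppw:3d} points per wavelength: FD {fd_err:.2e}, spectral {sp_err:.2e}")
```

For this smooth periodic test signal, the finite-difference error falls by roughly a factor of four each time the grid spacing is halved, while the spectral gradient is accurate to near machine precision even at four points per wavelength. Real media with discontinuous properties will not behave this cleanly.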
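The geometric origin of medium staircasing can also be sketched. The example below is purely illustrative and does not simulate the spurious reflections themselves. It assumes a circular inclusion sampled on a regular Cartesian grid and measures how far the stair-stepped boundary of the sampled region sits inside the true circle. The function name `max_boundary_offset` and the radius of 0.3 are hypothetical choices.

```python
import numpy as np


def max_boundary_offset(n, radius=0.3):
    """Sample a circular inclusion on an n-by-n grid over the unit square and
    return the largest distance between a boundary grid point of the sampled
    region and the true circle."""
    dx = 1.0 / n
    coords = (np.arange(n) + 0.5) * dx - 0.5
    x, y = np.meshgrid(coords, coords, indexing="ij")
    r = np.hypot(x, y)
    inside = r <= radius                  # material map: True inside the inclusion

    # A boundary point is an inside point with at least one outside neighbor
    has_outside_neighbor = np.zeros_like(inside)
    for axis in (0, 1):
        for shift in (-1, 1):
            has_outside_neighbor |= ~np.roll(inside, shift, axis=axis)
    boundary = inside & has_outside_neighbor

    return np.max(radius - r[boundary])


if __name__ == "__main__":
    for n in (32, 64, 128, 256):
        print(f"n = {n:4d}: dx = {1.0 / n:.4f}, max offset = {max_boundary_offset(n):.4f}")
```

The offset is always smaller than one grid spacing, so it shrinks as the grid is refined, but only in proportion to Δx. This is consistent with the statement that staircasing errors reduce with increasing grid density, and it shows why a material map fixed at image resolution limits how far a convergence test can go.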