Probabilistic diffusion tractography with multiple fibre orientations: What can we gain?

T.E.J. Behrens; H. Johansen Berg; S. Jbabdi; M.F.S. Rushworth; M. W. Woolrich

doi:10.1016/j.neuroimage.2006.09.018

Neuroimage. Author manuscript; available in PMC 2021 Jan 12.

Published in final edited form as:

Neuroimage. 2007 Jan 1; 34(1): 144–55.

Published online 2006 Oct 27. doi: 10.1016/j.neuroimage.2006.09.018

PMCID: PMC7116582

EMSID: EMS109748

PMID: 17070705

Probabilistic diffusion tractography with multiple fibre orientations: What can we gain?

T.E.J. Behrens,^†^‡ H. Johansen Berg,^† S. Jbabdi,^† M.F.S. Rushworth,^†^‡ and M. W. Woolrich^†

Author information Copyright and License information PMC Disclaimer

Associated Data

Supplementary Materials: Appendix.
EMS109748-supplement-Appendix.pdf (92K)
GUID: 814C2A30-2471-48CD-ACF4-608F512F0090

Abstract

We present a direct extension of probabilistic diffusion tractography to the case of multiple fibre orientations. Using automatic relevance determination, we are able to perform online selection of the number of fibre orientations supported by the data at each voxel, simplifying the problem of tracking in a multi-orientation field. We then apply the identical probabilistic algorithm to tractography in the multi- and single-fibre cases in a number of example systems which have previously been tracked successfully or unsuccessfully with single-fibre tractography. We show that multi-fibre tractography offers significant advantages in sensitivity when tracking non-dominant fibre populations, but does not dramatically change tractography results for the dominant pathways.

Diffusion tensor tractography can reconstruct many white matter bundles that can be seen in post-mortem human dissection [1, 2] and inferred from ex-vivo tracer studies in macaque [3, 4]. However, there are many examples of fibre pathways that are reported in dissection and tracer studies that are typically absent in diffusion tensor tractography studies. For example, when tracking from the cerebral peduncle or internal capsule, tractography studies tend only to find projections to the medial portion of the sensori-motor strip (e.g. [1, 5, 6] and many others), missing entirely the projections to lateral portions. These lateral regions, which appear unconnected to the peduncle contain the arm representation which is known to have cortico-spinal connections in many primate species [7]. There are many further examples of pathways which are known to exist, but are not reported in the tractography literature. A notable omission from the thalamic parcellation study presented in [3] is a projection from the medial geniculate nucleus (mgn) of thalamus to the primary auditory cortex. In [8] the authors were able to find subcortical sites projecting to different portions of parietal lobe, but were unable to find the known parietal connections with premotor cortex and frontal eye fields that travel down the 1^st and 2^nd portions of the superior longitudinal fasciculus (SLF), despite several different attempted strategies. However, more lateral regions in the parietal lobe were shown to be connected to ventral premotor cortex via the 3^rd portion of the SLF.

It would be tempting to suggest that, with increasing spatial resolution in our diffusion tensor images (DTI), we will see more and more of these previously invisible fibre bundles. Howevever, these examples, and many others, exist at a spatial resolution that is available to DTI tractography. The lateral portions of the cortico-spinal and cortico-bulbar tracts, and the medial sections of the superior longitudinal fasciculus are far bigger pathways than many that we are able to see. The explanation for this apparent dichotomy can be found at a more local scale. Several recent techniques [9, 10, 11, 12, 13, 14, 15, 16] have been proposed that aim to describe local diffusion within an imaging voxel without imposing the Gaussian constraint central to the diffusion tensor model[17]. These techniques have been able to estimate multiple dominant diffusion orientations within the same imaging voxel and suggest that they correspond to multiple fibre populations. In doing so, they have revealed large regions of white matter in which a single dominant orientation does not adequately describe the local diffusion profile. Included in these regions are all of the elusive pathways described above: The lateral motor pathways are obscured by the 3^rd (lateral) branch of the SLF, the acoustic radiations from the mgn are obscured by the optic radiations, and the medial 1^st portion of the SLF is obscured by the medial cortico-spinal fibres.

Attempts have been made to include such complex fibre information into both deterministic [18] and probabilistic [15, 19] tractography techniques, with promising early results including pathways from cerebral peduncle to the entire motor strip. In this manuscript, we present a direct extension of a previously published probabilistic tractography routine [20] to the case of multiple fibre orientations in each voxel. We propose an online Bayesian method for assessing the most appropriate number of fibre orientations for the data at each voxel, and carry out probabilistsic tractography through these complex orientation fields. Because this multi-fibre approach has a direct equivalent in the single fibre case, we are able to assess the advantages of increasing the local model complexity, while other factors remain constant. We find that, in general, the dominant pathway from a region remains the same between single- and multi-fibre approaches, but we show a number of examples where the multi-fibre approach is more sensitive to secondary or subordinate pathways.

1. Methods

1.1. Signal model

We use the model described in [20, 15]. It is a partial volume model, where the diffusion-weighted MR signal is split into an infinitely anisotropic component for each fibre orientation, and a single isotropic component. Unlike [20], here, we will infer on multiple fibre orientations.

The predicted signal for each diffusion-weighted measurement at each voxel is:

\begin{array}{l} S_{i} = S_{0} ((1 - \sum_{j = 1}^{N} f_{j}) exp (- b_{i} d) \\ + \sum_{j = 1}^{N} f_{j} exp (- b_{i} d r_{i}^{T} R_{j} {AR}_{j}^{T} r_{i})) \end{array}

(1)

where S ₀ is the non-diffusion-weighted signal value, d is the diffusivity, b_i and r_i are the b-value and gradient direction associated with the i ^th acquisition, and f_j and $R_{j} {AR}_{j}^{T}$ are the fraction of signal contributed by, and anisotropic diffusion tensor along, the j^th fibre orientation (θ_j,ϕ_j), and N is the maximum number of fibres. That is, A is fixed as:

A = (\begin{matrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{matrix}),

(2)

and R _j rotates A to (θ_j, ϕ_j). The noise is modeled separately for each voxel as independently identically distributed (iid) Gaussian. with a mean of zero and standard deviation across acquisitions of σ.

As in [20], we will use Bayesian estimation to fit the parameters of this model to the signal at each voxel. However, there is a slight twist. We only want to infer multiple fibre populations when there is evidence in the data that they exist. In voxels which truly only support a single fibre orientation (such as medial callosal voxels), fitting a more complex model to the data may lead to poor estimation of the true fibre orientation and its uncertainty, and will, in any case, provide practical problems for tractography. Second fibre orientations will be followed when they do not truly exist. We solve this problem using a Bayesian trick known as automatic relevance determination (ARD), or shrinkage priors, which were originally devised in the field of Neural Networks [21], but have since been used in neuroimaging [22, 23].

1.2. Automatic Relevance Determination

Automatic relevance determination is different from other model selection techniques, as it does not fit different models to the data separately, and compare them on the basis of a metric reflecting data fit and model complexity. Instead, ARD fits the more complex model, but ensures that parameters that are not supported by the data, do not contribute to the likelihood (by ensuring that, in the posterior distribution, they take the value zero with very low variance). ARD can be applied to many different parameters within a model, and each ARD will act independently on its own particular parameter. A major advantage of ARD is that it only require a single model to be fit to the data (as opposed to fitting every candidate model and comparing). Each parameter that is subject to ARD is then selected or deselected (forced to zero) depending on whether it is supported in the data.

Technically, this is achieved by placing a prior distribution on a parameter in a Bayesian model, which will force that parameter to take the value zero if, and only if, there is no evidence in the data for its existence. This prior distribution can take a number of forms. However, the most common of these is a Gaussian distribution with mean zero, but unknown variance. The variance is then estimated as a further parameter. If there is no evidence in the data for the existence of the original parameter, this variance term will be estimated very small, forcing the original parameter to zero. However, if the original parameter is supported by the data, the extra variance term will be estimated large, allowing the original parameter freedom to take any value.

To understand how this works, we must consider the energy functions in a Bayesian estimation scheme. The posterior energy (or negative log probability) is computed as the sum of the energies from the prior and likelihood distributions. If the original parameter does not contribute to the data, it will not reduce the likelihood energy, the prior energy will take over, the new variance will shrink to zero and, with it, so will the original parameter. The trade-off is between the potential reduction in likelihood energy available from the better fit to the data afforded by the extra parameter, and the potential reduction in prior energy available from having low variance when the parameter is close to zero. Note that ARD cannot work in the context of point estimation schemes (for example, finding the parameters which maximise the joint posterior probability), as there is a singularity when the prior variance is 0 (generating infinite negative energy). ARD relies on the marginalisation over hyperparameters that is built in to any fully Bayesian estimation scheme. Many of the equations in the remainder of this section are expanded and explained in the appendix.

Toy example

To illustrate the use of ARD, we will show its behaviour in the context of fitting a mean to some data. The parameter which may or may not be included in the model is the mean itself. In this simple case, we would not need to use ARD, as we can compute the true Bayesian evidence for the two candidate models, with and without the mean. This evidence tells us the probability of the existence of the extra parameter given the data. However, we can then use an ARD, and compare its results with the gold standard approach, which weights the two models by their Bayesian evidence. ^¹

In our toy example, our two candidate models for the perfect Bayesian model averaging approach are:

M0:

\begin{array}{l} P (Y_{i} | σ) & \sim & N (0, σ) \\ P (σ) & \sim & σ^{- 1} \end{array}

(3)

M1:

\begin{matrix} P (Y_{i} |, μ, σ) & \sim & N (μ, σ) \\ P (σ) & \sim & σ^{- 1} \\ P (μ) & \sim & U (- \infty, \infty) \end{matrix}

(4)

where Y_i is the i^th data point and μ and σ are the mean and standard deviation to be estimated. U(−∞, ∞) denotes a uniform distribution. Note that here, and throughout this manuscript, priors on variance parameters are reference priors P(σ) ~ σ ^-1(see [24] for a description of reference priors).

Our interest is in the posterior distribution on the mean, μ, which we can arrive at by averaging the models, weighted by their Bayesian evidence. Note that to arrive at this equation, we have had to compute the Bayesian joint posterior distribution for each model, and then marginalise over the variance parameter (the elemnts in this equation are explained in the appendix).

P (μ | Y) = \frac{δ (μ = 0) P (M 0 | Y) + P (μ | Y, M 1) P (M 1 | Y)}{P (M 0 | Y) + P (M 1 | Y)}

(5)

The ARD modelling approach can be written as follows:

\begin{array}{l} P (Y_{i} |, μ, σ) \sim N (μ, σ) \\ P (σ) \sim σ^{- 1} \\ P (μ | σ_{μ}) \sim N (0, σ_{μ}) \\ P (σ_{μ}) \sim σ_{μ}^{- 1} \end{array}

(6)

Note that for the ARD case, we do not need a metric to select between models, nor do we need explicitly to average the models using their posterior probability given the data. The ARD is a prior on μ, that will force μ to zero if it is not supported by the data, hence it will not contribute to the likelihood function. If there is doubt about whether μ is supported, then the posterior distribution on μ will have weight at zero, and weight at the most likely value for μ. The ARD will implicitly average the models with and without μ with the appropriate weights.

In each case, we can perform analytical integration over all variance parameters leaving us with the posterior distribution of the mean, μ given the data, Y (this integration is showin in the appendix). Figure 1 (a) shows this distribution for data comprising 40 randomly drawn samples from a distribution with true mean, 0.5, and true variance 1. In this case, the model evidence suggests a probability of 0.93 that the more complex model is favoured. The posterior distribution for the ARD model has two peaks. One is at zero (reflecting the probability of the simpler model) and another is at a non-zero value (reflecting the probability of the more complex model). The non-zero peak accounts for 0.84 of the weight of the distribution (compared to the true probability of 0.93 for the complex model). The slight difference between the two approaches demonstrates the approximate nature of the ARD solution.

An external file that holds a picture, illustration, etc.
Object name is EMS109748-f001.jpg

Open in a separate window

Figure 1

(a) Marginal posterior distribution on μ for true model averaging (red), and ARD approximation (black). (b) Effective prior distribution given by standard Gaussian ARD (red), and range-limited Beta ARD (black) in the range [0 1]

ARD on constrained parameters

We need to consider one further subtlety before applying an ARD prior to the volume fraction parameters: These parameters must be constrained to be between 0 and 1. Therefore, instead of using a Gaussian distribution with an unknown variance, we use a Beta distribution with mode at zero and an unknown width (see appendix for a description of the beta distribution).

P (f_{i} | η) \sim β (1, η)

(7)

P (η) \sim η^{- 1}

(8)

where, again, we use a reference prior for η [24]. We can integrate over η to give us the effective prior distribution on f_i, giving us:

P (f_{i}) = \frac{1}{(1 - f_{i}) log (1 - f_{i})}

(9)

Similarly, we can integrate out the variance parameter from a standard Gaussian ARD prior (from equations 7). giving:

P (μ) = \frac{1}{| μ |}

(10)

These two distributions can be compared in figure 1 (b). The question of model comparison becomes relevant when f_i is small; that is when the i^th fibre population is accounting for only a small proportion of the data. Note the similar behaviour in the priors in this region.

1.3. Model Priors

We use ARD on the parameters representing the volume fractions of all but the first fibre, $f_{2}^{N}$ (where $_{2}^{N}$ implies all fibres from 2 to N). This means that if there is no evidence for a second or third fibre orientation in the data, the volume fraction attributed to these fibres will be forced to zero. We do not apply an ARD to the first fibre in order to ensure that we are directly comparable with the previously published single fibre algorithm. i.e. that any increased sensitivity we see in the results can be solely attributed to an increase in complexity in local modelling.

The prior distributions on model variables are therefore as follows:

P (S_{0}) \sim U (0, \infty)

(11)

P (f_{1}) \sim U (0, 1)

(12)

P (f_{2}^{n}) \sim A R D

(13)

P (θ_{1}^{n}) \sim sin (θ)

(14)

P (ϕ_{1}^{n}) \sim U (0, 2 π)

(15)

P (σ) \sim σ^{- 1}

(16)

For details about these priors, see [20]. We also apply the further constraint that $\sum_{i = 1}^{N} f_{i} < 1$ . This constraint is applied easily in Markov Chain Monte Carlo, by rejecting any sample in which the constraint is not met.

1.4. Model estimation

We perform model estimation using Metropolis Hastings Markov Chain Monte Carlo sampling, after analytically marginalising over variance parameters, as in [20]. We burn in for 2000 jumps, then run the chain for a further 1000 jumps sampling every 20. Initialisation is by the log-linear diffusion tensor fit as in [20] unless any preprocessed neighbouring voxel has more than one surviving fibre orientation, in which case parameters are initialised according to the mean values from this neighbouring voxel. At the end of this procedure, we have samples from the posterior probability distribution on every parameter in the model, including the orientation and volume fraction parameters from each fibre population.

2. Probabilistic tractography in a multi-fibre field

We perform tractography using the same sampling scheme that was shown to sample from the global probability of connection in [20], and has also been used in [25, 26, 5, 27].

This scheme amounts to streamline tractography except, at each step, instead of progressing along the most likely principal diffusion direction, we draw a sample from the posterior distribution on principal diffusion directions and progress along this same direction. This streamline becomes a sample from the connectivity distribution, and we draw a large number of samples. Two streamline samples arriving at the same point in space will choose different samples from the posterior distribution on local orientation at that voxel, and therefore leave along different directions, hence accounting for the uncertainty in local fibre orientation. After drawing a large number of samples, we are able to compute the probability of the dominant streamline passing through any single voxel or region, by counting the number of streamlines that passed through that region and dividing by the total number of samples drawn.

The only adaption we need to make this procedure amenable to the multi-fibre case is, at each step, to choose from which population we draw the orientation sample. There are a number of available options here. Particularly appealing, given the modelling approach above, would be to draw a sample from a whole set of fibre orientations, and choose each orientation, i, with a probability proportional to f_i at that sample. However, in a crossing fibre region, this scheme will preferentially sample the dominant fibre, independent of the previous orientation of the streamline. Instead, we choose a scheme that aims to maintain the orientation of the streamline, as this will allow us to track non-dominant pathways through crossing regions. Again, we draw a single sample from a set of fibre orientations, we examine each fibre population to confirm that the volume fraction has not been forced to zero by automatic relevance determination (we threshold at 0.05). We then draw a sample from each remaining population (i.e. those have been judged to be supported by the data), and choose that sample whose orientation is closest to parallel to the preceding orientation of the streamline.

3. Data acquisition

60 direction data

data Diffusion-weighted data were acquired on a 1.5T Siemens Sonata scanner by using echo planar imaging (72× 2mm thick axial slices, matrix size 128×104, field of view 256×208mm ², giving a voxel size of 2×2×2 mm). The diffusion weighting was isotropically distributed along 60 directions by using a b value of 1, 000smm ⁻², allowing an echo time of 89ms and a repetition time of 9s, For each set of diffusion-weighted data, 5 volumes with no diffusion weighting were acquired at points throughout the acquisition. Three sets of diffusion-weighted data were acquired for subsequent averaging to improve signal-to-noise ratio. The total scan time for the diffusion-weighted imaging (DWI) protocol was 45 min.

12 direction data

From these same data, we selected the 12 gradient orientations that performed best under the energy minimisation scheme proposed by [28], to create a 12 direction diffusion dataset with a single non-diffusion-weighted volume.

In each dataset, each scan was subsequently aligned to a reference (b=0) scan using affine registration intended to maximise mutual information [29]. Brain extraction derived from this reference scan was then applied to each volume [30].

4. Results

4.1. Local fitting of the complex fibre model

Figure 2 shows typical examples of the probabilistic multi-fibre fit to the 60 direction dataset. Note the wide extent of complex fibre architecture in figure 2 (a). In this case, the diffusion data at one third of voxels with Fractional Anisotropy > 0.1 were able to support more than one fibre orientation, but no single voxel in the dataset supported more than two orientations. In the 12 direction set, no single voxel supported more than a single fibre orientation. Note that the method is general to any number of fibre populations, given a richness of data to support them (see appendix for relevant simulations investigating the angular resolution of the technique). b) and c) show close ups in lateral portions of the brain of the crossings between the longitudinal and the medio-lateral fibres. The second fibre direction (blue) is consistent between voxels, and joins into a connected path even in fine fibre bundles (c). (d) Shows samples from the posterior distributions on orientations of motor (red) and SLF (blue) orientations in a single voxel at the crossing. In human DTI data, the SLF dominates the motor pathways at this crossing, but fitting multiple orientations reveals that the fibre populations are not dissimilar in contribution to the MR signal. In this voxel (which is typical), the mean of the posterior distributions on f ₁ and f ₂ are 0.41 and 0.32 respectively. These numbers can be interpreted as the relative contributions to the MR signal of the SLF (f ₁) and motor (f ₂) projections. Note that the slight difference in volume fractions can also be seen as a slightly wider spread in the posterior distribution on the second orientation.

An external file that holds a picture, illustration, etc.
Object name is EMS109748-f002.jpg

Open in a separate window

Figure 2

Probabilistic multi-orientation fitting a) Axial slice showing regions where more than a single orientation were supported (thresholded at f_i 0.5 after ARD-based estimation). b) and c) axial and sagittal close ups of crossing fibre bundles with dominant fibre orientation in red and second in blue. Directions shown are the mean vectors of the posterior distribution samples. d) Samples from the posterior distributions on the first two fibre orientation in a voxel (green dot in a)) where the lateral motor projections (red) cross the longitudinal SLF projections (blue). The right hand-side is a 90° rotation of the left

4.2. Probabilistic Tractography

Tracking dominant projections

To interrogate the effect of the multi-fibre model on dominant fibre projections, we performed a connectivity-based thalamic segmentation as described in [3] using both single and multi-fibre apporaches. We defined seven cortical regions (prefrontal cortex, premotor cortex, primary motor cortex, primary sensory cortex, posterior parietal cortex, occipital cortex and temporal cortex) and thalamus from population probability maps (taken from [31]), thresholded at 30%. We drew 5000 samples from the connectivity distribution from each seed voxel in thalamus and computed the probability of connection to each cortical mask. We then assigned an index to each seed voxel corresponding to the target cortical zone with the highest connection probability to the seed. Here, and in subsequent experiments, we used a curvature threshold of 80° for stopping streamline trajectories [20]. The results with the single and multi-fibre approaches in a single typical subject can be seen in figure 3. The dominant projections can be seen to be similar between the two approaches.

An external file that holds a picture, illustration, etc.
Object name is EMS109748-f003.jpg

Open in a separate window

Figure 3

Thalamic parcellation with single fibre (a) and crossing-fibre (b) tractography. Key to cortical projections is as follows. Prefrontal cortex in burgundy, premotor cortex in red, primary motor cortex in light blue, primary sensory cortex in dark blue, posterior parietal cortex in orange, occipital cortex in mid-blue and temporal cortex in yellow.

Tracking non-dominant projections

To interrogate the increased sensitivity provided by the multi-fibre model, we performed tractography on pathways where known anatomical connections have previously proved difficult to trace in diffusion data.

Motor pathways

We defined a seed mask as the internal capsule on a single slice (Z=6) of the MNI 152 average brain [32]. We then defined primary motor cortex (M1) from a population probability map (taken from [31]) thresholded at 30%. In each of nine subjects, we then drew 5000 samples from the connectivity distribution from each seed voxel and retained only those samples that passed through the M1 target mask. To avoid inter-hemispheric connections, samples were excluded from further analysis if they crossed the mid-sagittal plane. We then repeated exactly the same experiment using single fibre tractography described in [20]. Figure 4 shows maximum intensity projections of the results for the two approaches.

An external file that holds a picture, illustration, etc.
Object name is EMS109748-f004.jpg

Open in a separate window

Figure 4

Tracking the cortico-spinal tract from the internal capsule to the primary motor cortex with single fibre (left) and multi-fibre (right) tractography. A coronal maximum intensity projection is shown. Voxels are colour coded from 25 (red) to 200 (yellow) samples passing through the voxel. Single fibre results are shown as the leftmost nine (3x3) subjects, and multi-fibre results are the rightmost nine subjects. Each subject is displayed in the same location in the two grids.

In each subject, the multi-fibre approach reveals more of the cortico-spinal pathways than the single-fibre approach. However, the ability of the multi-fibre approach to find the lateral motor projections is variable across subjects.

Medial SLF

We performed the same experiment in the medial portion of the SLF. Seed and target masks were defined as rectangles covering medial parietal and medial premotor cortex on a single slice (Z=64) of the MNI-152 brain. Samples crossing the midline or MNI slice Z=40 were terminated, so as any projections running through the corpus callosum or the cingulum bundle.

Fig 5 shows maximum intensity projections of the results for the two approaches. In each subject, the medial portion of the SLF is found by the multi-fibre approach, but is invisible to the single fibre approach as it is obscured by the medial motor projections.

An external file that holds a picture, illustration, etc.
Object name is EMS109748-f005.jpg

Open in a separate window

Figure 5

Tracking of the Parietal-Premotor connections of the medial portion of the superior longitudinal fasciculus in nine subjects with single fibre (3x3 on left) and multi-fibre (3x3 on right) tractography. A sagittal maximum intensity projection is shown. Voxels are colour coded from 25 (red) to 200 (yellow) samples passing through the voxel. Note that the single fibre tractography was not able to find any connections that reached the target in any subject.

Acoustic radiations

Seeds were defined as a cuboid on three slices (Z=-6 to -2) of the MNI average brain just medial to the left lateral geniculate nucleus (e.g. [33]). The target was defined as a region encompassing primary auditory cortex on a single sagittal slice (X=-56). To exclude other thalamo-temporal pathways (see e.g. [3]) and thalamo-parietal pathways (due to misregistration between individual subject space and the MNI space target), samples crossing the following planes were terminated: Z=-12, Z=46, Y=-44, Y=4.

Figure 6 shows maximum intensity projections of the results for the two approaches. In each subject, the acoustic radiation was found by the multi-fibre approach, but was invisible to the single fibre approach as it is obscured by the optic radiations.

An external file that holds a picture, illustration, etc.
Object name is EMS109748-f006.jpg

Open in a separate window

Figure 6

Tracking of the acoustic radiations between medial geniculate nucleus of thalamus and primary auditary cortex with single fibre (left) and multi-fibre (right) tractography. An axial maximum intensity projection is shown. Voxels are colour coded from 10 (red) to 50 (yellow) samples passing through the voxel. Note that the single fibre tractography was not able to find any connections that reached the target in any subject.

Limbic pathways of the sub-genual white matter

The subgenual white matter has been a successful target for deep-brain stimualtion in patients with treatment resistant depression [34]. Beneficial affects of stimulation could be mediated by paths to remote interconnected regions of the frontal and limbic system [34]. Connections from this region to the amygdala, a region critically implicated in depression and recovery[35], are largely lacking with single fibre tractography, but connections have been found connecting the same two regions tracking in the opposite direction [6]. We generated MNI space seed and target masks of the sub-genual white matter, and the amygdala, and ran tractography using both single- and multi-fibre techniques as described above.

Figure 7 shows maximum intensity projections of the results for the two approaches. With the single-fibre algorithm, the pathway was visible in 4 of 9 subjects, and with high variability in connection likelihoods. With the multi-fibre algorithm, the pathway could be seen in all subjects, but again with varying degrees of connection likelihood. This pathway is the non-dominant pathway in the region of complex fibre architecture in the sub-genual white matter. Therefore, as expected the multi-fibre approach has higher connection likelihoods than the single-fibre approach in all subjects.

An external file that holds a picture, illustration, etc.
Object name is EMS109748-f007.jpg

Open in a separate window

Figure 7

Tracking the projection from the sub-genual white matter to amygdala using single fibre (left) and multi-fibre (right) tractography. An coronal maximum intensity projection is shown. Voxels are colour coded from 10 (red) to 100 (yellow) samples passing through the voxel.

5. Discussion

We have presented a direct extension of [20] to the case of complex fibre architecture, and shown that, using this modelling approach, it is possible to detect white matter regions of complex fibre architecture that have previously been identified by model-free inversion approaches [10, 36, 11, 12, 37]. Furthermore, by inferring on a model of local diffusion, we are able to limit the regions in which multiple fibres are detected to those where the data requires such a complex model. In order to perform this reduction, we use automatic relevance determination in a formal Bayesian estimation scheme. Using a clinically feasible acquisition scheme with 60 diffusion directions, and a b-value of 1000, we were able to detect complex fibre architecture in approximately a third of voxels with an FA greater than 0.1. As shown in [15, 19], such approaches dramatically reduce the problem of tractography in a multi-fibre field to an extent that it is tractable to probabilistic sampling schemes. Tractography approaches that rely on techniques that detect complex fibre architecture in every voxel, whether or not it is supported by the data, will be much more susceptible to false positive connections than approaches that infer complexity in a selective manner.

Here, by using a natural extension to a single fibre tractography technique, we are able to examine directly the potential advantages of estimating complex architecture before performing tractography.

Connectivity-based segmentation of thalamus [3] which relies only on identifying the target with the highest connection probability was largely unchanged between the single and multi-fibre approaches, suggesting that the same dominant pathway is found by both approaches.

Furthermore, in four separate fibre systems where non-dominant pathways have previously been hard to find, we find that multi-fibre tractography is more sensitive to non-dominant projections that single fibre tractography. For example, the medial portion of the SLF was found strongly in all subjects, whereas it is invisible to single fibre approaches. However, in some cases this non-dominant projection is found with variable success and connection likelihood from subject to subject. For example, in the cortico-spinal tract, lateral projections to primary motor cortex were not found strongly in every subject (figure 4). It has previously been reported [10] that there are regions of the cortico-spinal tract where it has been possible to estimate three populations of fibres in high b-value diffusion data. The data we collected were not able to support a third orientation. It is possible that if we acquired more diffusion directions at a higher b-value, we would detect more than two orientations (see simulations in the appendix), and therefore find the whole cortico-spinal tract in a larger proportion of subjects.

In fact, the sensitivity to detect multiple fibre populations will depend not only on the data, but also on specifics of the reconstruction technique. Techniques for inferring multiple populations of fibres from diffusion data can largely be separated into two groups[14]: Those that use a model of the underlying diffusion profile, and therefore try to estimate fibre orientations (e.g [9, 15, 12, 38]); and those that attempt to estimate the diffusion profile itself in a model-free environment (e.g. [10, 16, 39]). Two of these methods have particular similarities to the approach taken here. CHARMED imaging [38] models the diffusion signal as a mixture of hindered and restricted diffusion compartments from around and within white matter axons. These compartments are modelled with a diffusion tensor (for the extra-axonal hindered compartment) and restricted cylindrical diffusion (for the intra-axonal restricted compartment). This model can be shown to estimate multiple fibre orientations with lower uncertainty (and hence a better fit to the data) than a model which ignores the restricted component [40]. There are strong parallels between CHARMED imaging, and the compartment model used in this manuscript. The key differences are the stronger biophysical motivation for the compartment divisions giving CHARMED the potential to ask quantitative questions about, for example, relative compartment volumes; and the simplicity of the model used in this manuscript meaning that it can be used with much sparser data (of the sort that is commonly acquired in research environments, and is easily transferred to the clinical domain). Q-ball imaging[10, 41] takes a very different approach, attempting to reconstruct the entire diffusion profile, rather than fibre orientations per se. However, there are parallels with the apporach taken here. The Funk Radon projection used to estimate this profile from a shell of q-space data is a projection of the data onto each infinitely anisotropic spike (much like those used in this paper). There are also key differences here, which highlight the advantages of both the model-based and model-free estimation approaches. Using, a model-free approach (e.g. Q-ball imaging), the potential angular resolution of the reconstructed maxima in the diffusion function is limited by the diffusion response function of a single fibre, even in a noise-free environment. Multiple lobes in the diffusion function can merge into one, giving an incorrect maximum. When explicit assumptions are made about the single fibre response function (using a model-based approach), there is no theoretical limit to the angle that can be resolved, if the model is a good predictor of the signal. However, errors in the assumed model will lead to errors in the recovered fibre orientations (and uncertainties).

Notwithstanding differences between techniques, our experience suggests that the most likely connections tend to be common to the two approaches (e.g. the thalamus segmentation presented above), but that in cases where false negatives in single-fibre tractography are due to difficulties in tracing a non-dominant pathway, the multi-fibre approach gives strikingly different results. This is encouraging, as it means that the increased complexity is unlikely to result in many more false positives, but it is also discouraging, as it means that any false positive results present in the simpler approach are also likely to be present when complex fibre architecture is modelled explicitly.

In summary, we believe that explicit modelling of fibre complexity will offer significant increases in sensitivity over traditional tensor, or single-fibre approaches, but that results such approaches should be interpreted with the anatomical care that has become indispensable in tractography studies.

An external file that holds a picture, illustration, etc.
Object name is EMS109748-f008.jpg

Open in a separate window

Figure 8

Simulations of two crossing fibres with different number of diffusion encoding orientations (main x-axis), SNR (main y-axis), b-values (individual x-axes), and separation angle (individual y-axes). Simluations were repeated 5 times. Greyscale shows the average number of fibres recovered by the estimation technique.

An external file that holds a picture, illustration, etc.
Object name is EMS109748-f009.jpg

Open in a separate window

Figure 9

Simulations of three fibres crossing at 90° with different number of diffusion encoding orientations (main x-axis), b-values (individual x-axes),and SNR (individual y-axes). Simluations were repeated 5 times. Greyscale shows the average number of fibres recovered by the estimation technique.

Supplementary Material

Appendix

Click here to view.^{(92K, pdf)}

Acknowledgments

The authors would like to acknowledge funding from the UK Medical Research Council (TEJB,HJB), the Royal Society (MFSR), the UK Engineering and Physical Sciences Research Council (MWW), the Dr. Hadwen Trust (SJ) and the Wellcome Trust (HJB). We would also like to thank Professor Paul Matthews and the Isle of Islay for providing their combined support for this work.

Footnotes

¹Note that in complicated models, such as the one we will use for the diffusion data, the evidence for competing models cannot be computed analytically.

References

[1] Catani M, Howard RJ, Pajevic S, Jones DK. Virtual in vivo interactive dissection of white matter fasciculi in the human brain. Neuroimage. 2002 Sep;17:77–94. [PubMed] [Google Scholar]

[2] Stieltjes B, Kaufmann WE, van Zijl PC, Fredericksen K, Pearlson GD, Solaiyappan M, Mori S. Diffusion tensor imaging and axonal tracking in the human brainstem. Neuroimage. 2001 Sep;14:723–735. [PubMed] [Google Scholar]

[3] Behrens TEJ, Johansen-Berg H, Woolrich MW, Smith SM, Wheeler-Kingshott CAM, Boulby PA, Barker GJ, Sillery EL, Sheehan K, Ciccarelli O, Thompson AJ, et al. Non-invasive mapping of connections between human thalamus and cortex using diffusion imaging. Nat Neurosci. 2003;6(7):750–757. [PubMed] [Google Scholar]

[4] Lehericy Stephane, Ducros Mathieu, Krainik Alexandre, Francois Chantal, Van de Moortele Pierre-Francois, Ugurbil Kamil, Kim Dae-Shik. 3-D diffusion tensor axonal tracking shows distinct SMA and Pre-SMA projections to the human striatum. Cereb Cortex. 2004 Dec;14:1302–1309. [PubMed] [Google Scholar]

[5] Lazar Mariana, Alexander Andrew L. Bootstrap white matter tractography (BOOT-TRAC) Neuroimage. 2005 Jan;24:524–532. [PubMed] [Google Scholar]

[6] Croxson Paula L, Johansen-Berg Heidi, Behrens Timothy EJ, Robson Matthew D, Pinsk Mark A, Gross Charles G, Richter Wolfgang, Richter Marlene C, Kastner Sabine, Rushworth Matthew FS. Quantitative investigation of connections of the prefrontal cortex in the human and macaque using probabilistic diffusion tractography. J Neurosci. 2005 Sep;25:8854–8866. [PMC free article] [PubMed] [Google Scholar]

[7] Porter R, Lemon RN. Corticospinal Function and Voluntary Movement. Oxford University Press; Oxford, U.K: 1993. [Google Scholar]

[8] Rushworth MF, Behrens TE, Johansen-Berg H. Connection Patterns Distinguish 3 Regions of Human Parietal Cortex. Cereb Cortex. 2005 Nov; JOURNAL ARTICLE [PubMed] [Google Scholar]

[9] Tuch DS, Reese TG, Wiegell MR, Makris N, Belliveau JW, Wedeen VJ. High angular resolution diffusion imaging reveals intravoxel white matter fiber heterogeneity. Magn Reson Med. 2002;48:577–582. [PubMed] [Google Scholar]

[10] Tuch David S, Reese Timothy G, Wiegell Mette R, Wedeen Van J. Diffusion MRI of complex neural architecture. Neuron. 2003 Dec;40:885–895. [PubMed] [Google Scholar]

[11] Jansons Kalvis M, Alexander Daniel C. Persistent Angular Structure: new insights from diffusion MRI data. Dummy version. Inf Process Med Imaging. 2003 Jul;18:672–683. Evaluation Studies. [PubMed] [Google Scholar]

[12] Tournier J-Donald, Calamante Fernando, Gadian David G, Connelly Alan. Direct estimation of the fiber orientation density function from diffusion-weighted MRI data using spherical deconvolution. Neuroimage. 2004 Nov;23:1176–1185. [PubMed] [Google Scholar]

[13] Liu Chunlei, Bammer Roland, Acar Burak, Moseley Michael E. Characterizing non-Gaussian diffusion by using generalized diffusion tensors. Magn Reson Med. 2004 May;51:924–937. [PubMed] [Google Scholar]

[14] Alexander Daniel C. Multiple-fiber reconstruction algorithms for diffusion MRI. Ann N Y Acad Sci. 2005 Dec;1064:113–133. [PubMed] [Google Scholar]

[15] Hosey Tim, Williams Guy, Ansorge Richard. Inference of multiple fiber orientations in high angular resolution diffusion imaging. Magn Reson Med. 2005 Dec;54:1480–1489. [PubMed] [Google Scholar]

[16] Wedeen Van J, Hagmann Patric, Tseng Wen-Yih Isaac, Reese Timothy G, Weisskoff Robert M. Mapping complex tissue architecture with diffusion spectrum magnetic resonance imaging. Magn Reson Med. 2005 Dec;54:1377–1386. [PubMed] [Google Scholar]

[17] Basser PJ, Matiello J, Le Bihan D. Estimation of the effective self-diffusion tensor from the NMR spin echo. J Magn Reson B. 1994;103:247–254. [PubMed] [Google Scholar]

[18] Hagmann P, Reese TG, Tseng W-YI, Meuli R, Thiran J-P, Wedeen VJ. Diffusion spectrum imaging tractography in complex cerebral white matter: an investigation of the centrum semiovale. ISMRM; 2004. p. 623. [Google Scholar]

[19] Parker Geoffrey JM, Alexander Daniel C. Probabilistic anatomical connectivity derived from the microscopic persistent angular structure of cerebral tissue. Philos Trans R Soc Lond B Biol Sci. 2005 May;360:893–902. [PMC free article] [PubMed] [Google Scholar]

[20] Behrens TEJ, Woolrich MW, Jenkinson M, Johansen-Berg H, Nunes RG, Clare S, Matthews PM, Brady JM, Smith SM. Characterization and propagation of uncertainty in diffusion-weighted MR imaging. Magn Reson Med. 2003 Nov;50:1077–1088. [PubMed] [Google Scholar]

[21] MacKay DJC. Probable networks and plausible predictions - a review of practical Bayesian methods for supervised neural networks. Network: Computation in Neural Systems. 1995;6:469–505. [Google Scholar]

[22] Friston KJ, Harrison L, Penny W. Dynamic causal modelling. Neuroimage. 2003 Aug;19:1273–1302. [PubMed] [Google Scholar]

[23] Woolrich M, Smith S. Hierarchical fully bayesian spatiotemporal analysis of fmri data. Proceedings of the 7th Annual Conference on Functional Mapping of the Human Brain; Brighton, UK: Wiley Interscience; 2001. [Google Scholar]

[24] Bernardo JM, Smith AFM. Bayesian Theory. Wiley; 2000. [Google Scholar]

[25] Parker Geoff JM, Alexander Daniel C. Probabilistic Monte Carlo based mapping of cerebral connections utilising whole-brain crossing fibre information. Inf Process Med Imaging. 2003 Jul;18:684–695. Evaluation Studies. [PubMed] [Google Scholar]

[26] Hagmann P, Thiran J-P, Jonasson L, Vandergheynst P, Clarke S, Maeder P, Meuli R. DTI mapping of human brain connectivity: statistical fibre tracking and virtual dissection. Neuroimage. 2003 Jul;19:545–554. Clinical Trial. [PubMed] [Google Scholar]

[27] Jones Derek K, Pierpaoli Carlo. Confidence mapping in diffusion tensor magnetic resonance imaging tractography using a bootstrap approach. Magn Reson Med. 2005 May;53:1143–1149. [PubMed] [Google Scholar]

[28] Jones DK, Horsfield MA, Simmons A. Optimal strategies for measuring diffusion in anisotropic systems by magnetic resonance imaging. Magn Reson Med. 1999 Sep;42:515–525. [PubMed] [Google Scholar]

[29] Jenkinson M, Smith S. A global optimisation method for robust affine registration of brain images. Med Image Anal. 2001 Jun;5:143–156. [PubMed] [Google Scholar]

[30] Smith Stephen M, Zhang Yongyue, Jenkinson Mark, Chen Jacqueline, Matthews PM, Federico Antonio, De Stefano Nicola. Accurate, robust, and automated longitudinal and cross-sectional brain change analysis. Neuroimage. 2002 Sep;17:479–489. [PubMed] [Google Scholar]

[31] Johansen-Berg H, Behrens TE, Sillery E, Ciccarelli O, Thompson AJ, Smith SM, Matthews PM. Functional-Anatomical Validation and Individual Variation of Diffusion Tractography-based Segmentation of the Human Thalamus. Cereb Cortex. 2004 Jul [PubMed] [Google Scholar]

[32] Evans AC, Marrett S, Neelin P, Collins L, Worsley K, Dai W, Milot S, Meyer E, Bub D. Anatomical mapping of functional activation in stereotactic coordinate space. Neuroimage. 1992 Aug;1:43–53. [PubMed] [Google Scholar]

[33] Devlin JT, Sillery EL, Hall DA, Hobden P, Behrens TEJ, Nunes RG, Clare S, Matthews PM, Moore DR, Johansen-Berg H. Reliable identification of the auditory thalamus using multi-modal structural analyses. Neuroimage. 2006 May;30:1112–1120. [PMC free article] [PubMed] [Google Scholar]

[34] Mayberg Helen S, Lozano Andres M, Voon Valerie, McNeely Heather E, Seminowicz David, Hamani Clement, Schwalb Jason M, Kennedy Sidney H. Deep brain stimulation for treatment-resistant depression. Neuron. 2005 Mar;45:651–660. [PubMed] [Google Scholar]

[35] Drevets Wayne C. Neuroimaging abnormalities in the amygdala in mood disorders. Ann N Y Acad Sci. 2003 Apr;985:420–444. [PubMed] [Google Scholar]

[36] Tuch David S, Wisco Jonathan J, Khachaturian Mark H, Ekstrom Leeland B, Kotter Rolf, Vanduffel Wim. Q-ball imaging of macaque white matter architecture. Philos Trans R Soc Lond B Biol Sci. 2005 May;360:869–879. [PMC free article] [PubMed] [Google Scholar]

[37] Ozarslan Evren, Shepherd Timothy M, Vemuri Baba C, Blackband Stephen J, Mareci Thomas H. Fast orientation mapping from HARDI. Med Image Comput Comput Assist Interv Int Conf Med Image Comput Comput Assist Interv; 2005. pp. 156–163. [PubMed] [Google Scholar]

[38] Assaf Yaniv, Freidlin Raisa Z, Rohde Gustavo K, Basser Peter J. New modeling and experimental framework to characterize hindered and restricted water diffusion in brain white matter. Magn Reson Med. 2004 Nov;52:965–978. [PubMed] [Google Scholar]

[39] Ozarslan Evren, Mareci Thomas H. Generalized diffusion tensor imaging and analytical relationships between diffusion tensor imaging and high angular resolution diffusion imaging. Magn Reson Med. 2003 Nov;50:955–965. [PubMed] [Google Scholar]

[40] Assaf Yaniv, Basser Peter J. Composite hindered and restricted model of diffusion (CHARMED) MR imaging of the human brain. Neuroimage. 2005 Aug;27:48–58. [PubMed] [Google Scholar]

[41] Tuch David S. Q-ball imaging. Magn Reson Med. 2004 Dec;52:1358–1372. [PubMed] [Google Scholar]