## Abstract

Emerging evidence indicates that information processing, as well as learning and memory, at both the network and single-neuron levels is highly dependent on the correlation structure of multiple spike trains. Contemporary experimental and theoretical studies that involve quasi-realistic neuronal stimulation therefore require a method for controlling spike train correlations. This letter introduces a new general strategy for generating multiple spike trains with exactly controlled mean firing rates and correlation structure (defined in terms of auto- and cross-correlation functions). Our approach nonlinearly transforms random gaussian-distributed processes with a predistorted correlation structure into nonnegative rate processes, which are then used to generate doubly stochastic Poisson point processes with the required correlation structure. We show how this approach can be used to generate stationary or nonstationary spike trains from small or large groups of neurons with diverse auto- and cross-correlation structures. We derive analytical formulas for the high-order correlation structure of the generated spike trains and discuss the limitations of this approach.

## 1. Introduction

The principal representation of neural information is in sequences of action potentials (APs). Within the mammalian central nervous system, each neuron receives thousands of such input sequences (spike trains) from presynaptic cells. A neuron's output is believed to be dependent on not only the mean firing rate of presynaptic cells but also the degree of spatial and temporal correlation between spike trains from different presynaptic neurons (for reviews, see, e.g., Abeles, 1991; Mel, 1994; Segev & London, 2000; Hausser & Mel, 2003; London & Hausser, 2005). Furthermore, synaptic plasticity processes are also known to be highly dependent on the correlation structure between pre- and postsynaptic neurons, a process termed spike-timing-dependent plasticity (STDP; Dayan & Abbott, 2001). The multicorrelation structure of spike trains is also believed to be crucially important to the representation and processing of network information (Abeles, 1991), where even weak pair-wise correlations can lead to highly organized network states (Schneidman, Berry, Segev, & Bialek, 2006).

Methodological advances in recent years have led to an experimental ability to rapidly stimulate dendrites with high temporal resolution and at a spatial resolution approaching a single synapse (Matsuzaki et al., 2001; Shoham, O'Connor, Sarkisov, & Wang, 2005; Araya, Eisenthal, & Yuste, 2006; Losonczy & Magee, 2006). Similar optical tools can also be used to individually stimulate groups of presynaptic neurons (Boucsein, Nawrot, Rotter, Aertsen, & Heck, 2005; Shoham, O'Connor et al., 2005; Farah, Reutsky, & Shoham, 2007; Nikolenko, Poskanzer, & Yuste, 2007). These tools enable the study of responses to controlled yet realistically complex input patterns. Of particular interest among the large space of possible stimulation patterns are responses to varying degrees of correlation between spike trains. Conducting such studies systematically, however, requires a standard method for generating spike trains with a specified degree of correlation between them.

The central measures of correlations among and between random processes are the auto- and cross-correlation functions (estimated by the auto- and cross-correlograms). Spike train correlation functions describe not only the coincidence of spikes, but also probabilities for spikes to appear at different relative delays, whether in two different sequences (cross-correlation function) or in the same one (autocorrelation function). Currently, there is no method that enables spike train generation with these functions predefined. Recent work by Niebur (2007) proposes a partial solution that allows the generation of spike trains with a specific degree of coincident spikes (with predefined time domain binning) between different sequences. Here we present a new approach to spike train generation (stationary or nonstationary) in continuous time (no prebinning of the time domain) with predefined auto- and cross-correlation functions. Our approach is based on analytical solutions of the nonlinear distortion of correlation structure in a linear-nonlinear-Poisson (LNP) generative model with gaussian white noise inputs.

## 2. Method

### 2.1. Outline.

We consider the problem of generating spike trains as doubly stochastic Poisson point processes (Snyder & Miller, 1991; also known as Cox processes: Cox, 1955; Cox & Lewis, 1966) with stochastic intensity functions (instantaneous rates) having specific auto- and cross-correlation functions. A Poisson process is called doubly stochastic if its rate profile is itself a random process. Interestingly, point processes generated this way preserve the full correlation structure of the corresponding stochastic intensity processes (see Knox, 1970 and proof below). This important property reduces the problem of generating correlated point processes to the problem of generating correlated continuous stochastic nonnegative processes. We next reduce the complexity of the problem by generating these nonnegative processes through a nonlinear transformation applied to correlated gaussian processes, an approach reviewed in Johnson (1994). This transformation affects not only the marginal distributions of the processes but also their correlation functions, so we create correlated gaussian processes whose correlation structure is appropriately predistorted to yield the desired final correlations in the nonnegative processes. Table 1 and Figure 1 outline the full algorithm.
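As a concrete illustration of the pipeline, the following minimal Python/NumPy sketch generates a pair of correlated spike trains for the simplest case — zero-lag correlation only, exponential nonlinearity, white-in-time rates. The target moments (`m`, `R11`, `R12`) are assumed values chosen for illustration; the full method additionally shapes temporal correlations via FIR filtering.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed target statistics of the nonnegative rate processes (Hz):
# mean rate m, second moment R11 = E[L_i^2], cross-moment R12 = E[L1*L2].
m, R11, R12 = 20.0, 500.0, 430.0      # requires R11 >= m**2

# Step 1: predistortion for the exponential transform L = exp(mu + sigma*X).
sigma2 = np.log(R11 / m**2)           # sigma_i^2
mu = np.log(m) - sigma2 / 2.0         # mu_i
r12 = np.log(R12 / m**2) / sigma2     # predistorted gaussian correlation

# Step 2: correlated standard gaussian processes (white in time here).
n, dt = 2_000_000, 1e-3               # 1 ms bins
z = rng.standard_normal((2, n))
x1 = z[0]
x2 = r12 * z[0] + np.sqrt(1.0 - r12**2) * z[1]

# Step 3: nonnegative rate processes, then doubly stochastic Poisson counts.
lam1 = np.exp(mu + np.sqrt(sigma2) * x1)
lam2 = np.exp(mu + np.sqrt(sigma2) * x2)
s1, s2 = rng.poisson(lam1 * dt), rng.poisson(lam2 * dt)

print(np.mean(lam1), np.mean(lam1 * lam2))   # close to m and R12
```

The empirical moments of both the rates and the spike counts approach the targets as the simulated duration grows.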

### 2.2. Correlation Distortions.

Our method is based on predistorting the correlation structure of random gaussian processes in an exact way such that a nonnegative transformation applied to them will yield the desired auto- and cross-correlation functions (Johnson, 1994). We derive below the distortions resulting from three basic nonnegative transformations: exponential, square, and absolute value. We begin by restricting the discussion to correlation coefficients between draws of random variables $X_i$ and $\Lambda_i$. In section 2.4, we expand this treatment to auto- and cross-correlation functions between stochastic processes $x_i(t)$ and $\lambda_i(t)$ by appropriately arranging and filtering uncorrelated processes.

#### 2.2.1. Exponential Transformation.

We generate a nonnegative random variable from a standard gaussian variable $X_i \sim N(0, 1)$ using the following transformation:

$$\Lambda_i = \exp(\mu_i + \sigma_i X_i). \tag{2.1}$$

The resulting variable $\Lambda_i$ has a log-normal distribution with expectation

$$E[\Lambda_i] = \exp\left(\mu_i + \tfrac{\sigma_i^2}{2}\right), \tag{2.2}$$

where we have used Wick's theorem (Simon, 1974) for mean-zero normal random variables $v \sim N(0, \sigma_v^2)$:

$$E[e^v] = e^{\sigma_v^2/2}. \tag{2.3}$$

The transformation distorts the correlation coefficients $E[X_i \cdot X_j] = r_{ij}$ into

$$R_{ij} = E[\Lambda_i \Lambda_j] = \exp\left(\mu_i + \mu_j + \tfrac{\sigma_i^2 + \sigma_j^2}{2} + \sigma_i \sigma_j r_{ij}\right), \tag{2.4}$$

and for the case $i = j$ ($r_{ii} = 1$):

$$R_{ii} = E[\Lambda_i^2] = \exp\left(2\mu_i + 2\sigma_i^2\right). \tag{2.5}$$

Equations 2.2 and 2.5 together allow us to evaluate the parameters $\mu_i$ and $\sigma_i$:

$$\sigma_i^2 = \ln\frac{R_{ii}}{E^2[\Lambda_i]}, \qquad \mu_i = 2\ln E[\Lambda_i] - \tfrac{1}{2}\ln R_{ii}. \tag{2.6}$$

Finally, from equation 2.4, we derive the correlation predistortion as

$$r_{ij} = \frac{1}{\sigma_i \sigma_j}\,\ln\frac{R_{ij}}{E[\Lambda_i]\,E[\Lambda_j]}. \tag{2.7}$$

From equation 2.6, we easily obtain the following restriction on the range of permissible values for $R_{ii}$: $R_{ii} \geqslant E^2[\Lambda_i]$.
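In code, the closed-form relations for the exponential case reduce to a few lines. The sketch below (Python, with assumed example values; the helper names are ours) wraps the parameter inversion and the predistortion, and checks that the forward distortion recovers the target:

```python
import math

def exp_params(mean_rate, R_ii):
    """Invert E[L] and R_ii = E[L^2] for L = exp(mu + sigma*X), X ~ N(0,1)."""
    if R_ii < mean_rate**2:
        raise ValueError("exponential transform requires R_ii >= E[L]^2")
    sigma2 = math.log(R_ii / mean_rate**2)
    mu = math.log(mean_rate) - sigma2 / 2.0
    return mu, sigma2

def exp_predistort(R_ij, mean_i, mean_j, sigma2_i, sigma2_j):
    """Gaussian correlation r_ij that the exponential maps onto R_ij."""
    return math.log(R_ij / (mean_i * mean_j)) / math.sqrt(sigma2_i * sigma2_j)

def exp_distort(r_ij, mu_i, mu_j, sigma2_i, sigma2_j):
    """Forward distortion: R_ij = E[L_i L_j] for the exponential transform."""
    return math.exp(mu_i + mu_j + (sigma2_i + sigma2_j) / 2.0
                    + math.sqrt(sigma2_i * sigma2_j) * r_ij)

mu, s2 = exp_params(20.0, 500.0)          # assumed targets: mean 20, R_ii 500
r = exp_predistort(430.0, 20.0, 20.0, s2, s2)
print(exp_distort(r, mu, mu, s2, s2))     # recovers 430.0 up to rounding
```

The round trip is exact because both the distortion and its inverse are in closed form; the case $R_{ij} = R_{ii}$ maps back to $r_{ij} = 1$, as it must.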

#### 2.2.2. Square Transformation.

We generate a nonnegative random variable by squaring a shifted and scaled gaussian variable $X_i \sim N(0, 1)$:

$$\Lambda_i = (\mu_i + \sigma_i X_i)^2. \tag{2.8}$$

The resulting random variable $\Lambda_i$ has an unnormalized noncentral $\chi^2_1$ distribution with expectation

$$E[\Lambda_i] = \mu_i^2 + \sigma_i^2. \tag{2.9}$$

This transformation distorts the correlation coefficients of the gaussian variables into

$$R_{ij} = E[\Lambda_i \Lambda_j] = E[\Lambda_i]\,E[\Lambda_j] + 2\sigma_i^2\sigma_j^2 r_{ij}^2 + 4\mu_i\mu_j\sigma_i\sigma_j r_{ij}, \tag{2.10}$$

where we have used Isserlis's formula (1918) to calculate the fourth-order moments of zero-mean gaussian variables:

$$E[X_i X_j X_k X_l] = r_{ij}r_{kl} + r_{ik}r_{jl} + r_{il}r_{jk}. \tag{2.11}$$

The quadratic equation, 2.10, has the following solutions for the predistortion:

$$r_{ij} = \frac{-\mu_i\mu_j \pm \sqrt{\mu_i^2\mu_j^2 + \tfrac{1}{2}\left(R_{ij} - E[\Lambda_i]E[\Lambda_j]\right)}}{\sigma_i\sigma_j}. \tag{2.12}$$

For the case $i = j$, equation 2.10 reduces to

$$R_{ii} = E^2[\Lambda_i] + 2\sigma_i^4 + 4\mu_i^2\sigma_i^2. \tag{2.13}$$

Solving equation 2.13 together with 2.9 yields the transformation's parameters,

$$\sigma_i^2 = E[\Lambda_i] - \sqrt{\tfrac{1}{2}\left(3E^2[\Lambda_i] - R_{ii}\right)}, \qquad \mu_i^2 = E[\Lambda_i] - \sigma_i^2, \tag{2.14}$$

with the following bounds for the nonnegative process variance derived from equation 2.14: $E^2[\Lambda_i] \leqslant R_{ii} \leqslant 3E^2[\Lambda_i]$.
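The square-transformation relations can be wrapped the same way. In this Python sketch (assumed example values, helper names ours), the predistortion takes the quadratic root that correctly maps $R_{ii}$ back to $r_{ii} = 1$:

```python
import math

def square_params(mean_rate, R_ii):
    """Invert E[L] = mu^2 + sigma^2 and R_ii = E[L^2] for L = (mu + sigma*X)^2.
    Valid only when E[L]^2 <= R_ii <= 3 E[L]^2."""
    m = mean_rate
    if not m**2 <= R_ii <= 3 * m**2:
        raise ValueError("square transform requires E[L]^2 <= R_ii <= 3 E[L]^2")
    sigma2 = m - math.sqrt((3 * m**2 - R_ii) / 2.0)
    mu = math.sqrt(m - sigma2)
    return mu, sigma2

def square_predistort(R_ij, mean_i, mean_j, mu_i, mu_j, sigma2_i, sigma2_j):
    """Root of the quadratic distortion (the branch giving r_ii = 1)."""
    a = mu_i * mu_j
    disc = a**2 + (R_ij - mean_i * mean_j) / 2.0
    return (-a + math.sqrt(disc)) / math.sqrt(sigma2_i * sigma2_j)

def square_distort(r, mu_i, mu_j, sigma2_i, sigma2_j):
    """Forward distortion: R_ij = E[L_i]E[L_j] + 2 s_i^2 s_j^2 r^2 + 4 mu_i mu_j s_i s_j r."""
    mi, mj = mu_i**2 + sigma2_i, mu_j**2 + sigma2_j
    s = math.sqrt(sigma2_i * sigma2_j)
    return mi * mj + 2 * s**2 * r**2 + 4 * mu_i * mu_j * s * r

mu, s2 = square_params(20.0, 500.0)       # assumed targets: mean 20, R_ii 500
r = square_predistort(430.0, 20.0, 20.0, mu, mu, s2, s2)
print(square_distort(r, mu, mu, s2, s2))  # recovers 430.0 up to rounding
```

Note the stricter feasibility window of the square transform compared with the exponential one: targets with $R_{ii} > 3E^2[\Lambda_i]$ are simply rejected.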

#### 2.2.3. Absolute Value Transformation.

In the appendix, we derive a numerical solution for this transformation, which is inherently more computationally expensive than the analytical solutions available for the other two transformations above.

### 2.3. Doubly Stochastic Poisson Processes Possess the Same Correlations as Their Stochastic Intensity Functions.

*Definitions:* A point process can be represented by its counting process $N(t)$, which has unit increments every time there is an event and remains constant until the next event occurs. Its respective increment process $dN(t)$ equals unity at all times $t$ where an event occurs, and zero otherwise. For regular point processes, the conditional probability for an event obeys

$$P\{dN(t) = 1 \mid \lambda(t)\} = \lambda(t)\,dt, \tag{2.16}$$

where $\lambda(t)$ is the stochastic intensity, or instantaneous firing rate, of the point process at time $t$. Nonstationary auto- and cross-correlation functions between two stochastic intensity functions are defined as

$$R_{ij}(t, \tau) = E[\lambda_i(t + \tau)\,\lambda_j(t)]. \tag{2.17}$$

Similarly, for point processes,

$$R^P_{ij}(t, \tau) = \frac{E[dN_i(t + \tau)\,dN_j(t)]}{dt^2}. \tag{2.18}$$

*The instantaneous cross-correlation function RP _{ij}(t, τ) between two doubly stochastic Poisson processes is equal to the cross-correlation function between their respective intensity functions R_{ij}(t, τ). This property also holds for the autocorrelation function of a single point process, except for τ = 0, where it behaves as a delta function*.

*Proof:* Conditioning on the intensity functions at times $t$ and $t + \tau$ and using equation 2.16, we arrive at

$$E[dN_i(t + \tau)\,dN_j(t)] = E[\lambda_i(t + \tau)\,\lambda_j(t)]\,dt^2,$$

and then

$$R^P_{ij}(t, \tau) = R_{ij}(t, \tau).$$

Similarly, for an autocorrelation of a single point process, we get ($\tau \neq 0$): $R^P_{ii}(t, \tau) = R_{ii}(t, \tau)$. And in the case of $\tau = 0$, $E[dN_i^2(t)] = E[dN_i(t)] = E[\lambda_i(t)]\,dt$, which contributes the delta-function term $E[\lambda_i(t)]\,\delta(\tau)$, thereby completing the proof.
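This invariance is easy to check numerically. The Python/NumPy sketch below (squared-gaussian intensities chosen arbitrarily for illustration) compares the zero-lag second moments of bin-wise Poisson counts against those of the underlying intensities; the cross-moment matches, while the auto-moment picks up the extra $E[\lambda]/dt$ term corresponding to the delta function:

```python
import numpy as np

rng = np.random.default_rng(1)
n, dt, rho = 2_000_000, 1e-3, 0.6

# Correlated nonnegative intensity processes (white in time, squared
# gaussians for illustration -- the property holds for any intensities).
z0, z1 = rng.standard_normal(n), rng.standard_normal(n)
lam1 = (4.0 + 1.5 * z0) ** 2
lam2 = (4.0 + 1.5 * (rho * z0 + np.sqrt(1.0 - rho**2) * z1)) ** 2

# Doubly stochastic Poisson counts in bins of width dt.
s1, s2 = rng.poisson(lam1 * dt), rng.poisson(lam2 * dt)

# Zero-lag cross-moment of the counts matches that of the intensities ...
print(np.mean(s1 * s2) / dt**2, np.mean(lam1 * lam2))
# ... while the auto-moment acquires the extra delta term E[lam]/dt:
print(np.mean(s1**2) / dt**2, np.mean(lam1**2) + np.mean(lam1) / dt)
```

Both printed pairs agree to within sampling error, illustrating the proposition for the cross term and the $\tau = 0$ correction for the auto term.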

### 2.4. Generating the Correlated Processes.

Given the set of target correlation functions $r_{ij}(\tau)$, we construct a matrix containing all the target pair-wise correlations in block Toeplitz structure. To create the desired correlated gaussian processes, we construct finite impulse response (FIR) filters and apply them to $N$ white gaussian processes $w_i(t)$ of length $L$ in the discrete time form. To apply these filters in the stationary setting, we first build Toeplitz matrices $\mathbf{W}_i$ where each of the $N$ processes of length $L$ is copied into $2K + 1$ rows with increasing delays ($-K \cdot \Delta t \leqslant \tau \leqslant K \cdot \Delta t$). These matrices, when concatenated vertically, construct a matrix $\mathbf{W}$ that contains all the processes with all the required delayed replications:

$$\mathbf{W} = \begin{bmatrix} \mathbf{W}_1 \\ \vdots \\ \mathbf{W}_N \end{bmatrix}. \tag{2.19}$$

Generating processes with the desired correlation structure $r_{ij}(\tau) = E[x_i(t + \tau) \cdot x_j(t)]$ requires the original white processes to be multiplied by the correlation matrix square root (Johnson, 1994): $\mathbf{X} = \mathbf{R}^{1/2}\mathbf{W}$, where $\mathbf{R}^{1/2}\left(\mathbf{R}^{1/2}\right)^T = \mathbf{R}$ and $\mathbf{R}$ is a block matrix composed of Toeplitz matrices corresponding to each of the auto- and cross-correlation functions:

$$\mathbf{R} = \begin{bmatrix} \mathbf{r}_{11} & \cdots & \mathbf{r}_{1N} \\ \vdots & \ddots & \vdots \\ \mathbf{r}_{N1} & \cdots & \mathbf{r}_{NN} \end{bmatrix}. \tag{2.20}$$

In practice, the resulting matrix of filtered processes is highly redundant as a consequence of the replication procedure applied in equations 2.19 and 2.20, and we select only *N* of its rows (the central row in each block) to perform the filtering. In the nonstationary case, the matrices **W**_{i} and **r**_{ij} do not have a Toeplitz structure, and the matrix is not redundant.

We note that in addition to this FIR-based approach, an alternative method for constructing multivariate correlated gaussian processes using multivariate autoregressive models is also possible here.

Once the correlated gaussian processes have been generated, they are nonlinearly transformed into rate processes using one of the transformations discussed above. Generating the Poisson spike trains from these inhomogeneous rate processes $\lambda(t)$ proceeds using the standard time-rescaling procedure described by Johnson (1996, eq. 7). In this procedure, a sequence of independent exponentially distributed random variables $\zeta_k \sim \exp(1)$ is generated, and the event times $t_k$ are then calculated by solving

$$\int_{t_{k-1}}^{t_k} \lambda(t)\,dt = \zeta_k.$$
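The time-rescaling step can be sketched as follows in Python/NumPy. A piecewise-constant rate and linear interpolation of the integrated rate are simplifying assumptions, and `sample_poisson_times` is our own helper name, not a library function:

```python
import numpy as np

def sample_poisson_times(lam, dt, rng):
    """Event times of an inhomogeneous Poisson process with piecewise-constant
    rate lam (Hz) on bins of width dt, via time rescaling: each event time t_k
    solves  integral_0^{t_k} lam(t) dt = zeta_1 + ... + zeta_k,  zeta ~ Exp(1)."""
    Lam = np.concatenate(([0.0], np.cumsum(lam) * dt))   # integrated rate
    total = Lam[-1]
    # Cumulative sums of iid Exp(1) variables, up to the total integral.
    targets, s = [], 0.0
    while True:
        s += rng.exponential(1.0)
        if s > total:
            break
        targets.append(s)
    # Invert the monotone integrated rate by linear interpolation.
    t_grid = np.arange(Lam.size) * dt
    return np.interp(targets, Lam, t_grid)

rng = np.random.default_rng(2)
lam = np.full(100_000, 30.0)           # constant 30 Hz over 100 s, dt = 1 ms
times = sample_poisson_times(lam, 1e-3, rng)
print(times.size / 100.0)              # empirical rate, close to 30 Hz
```

For a constant rate this reduces to an ordinary homogeneous Poisson process; for the correlated rate processes generated above, `lam` would simply be replaced by the transformed gaussian process.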

## 3. Results

Using our approach, it is possible to generate spike trains with different correlation structures that are typical of real neural activity (see Figures 2C–2F for examples). The cross-correlation function between two signals changes as it propagates through the steps of the algorithm and matches its theoretical value at every step (see Figures 2A–2C).

The generated point processes have intrinsically much more variability than the underlying intensity processes. This is reflected in the larger variance of the correlation function estimates (see Figures 2C–2F).

In principle, the approach developed in this letter is general enough to also generate nonstationary point processes with a predefined correlation structure. We tested the algorithm's ability to generate spike trains with time-varying rates and instantaneous, time-varying correlation structures. Correlation structures in this general scenario are typically represented in neuroscience research using the joint peristimulus time histogram (JPSTH). Figure 3 presents a predefined and a data-estimated JPSTH for processes whose mean has a gaussian temporal profile and displays a complex biphasic time-varying interaction. As can be seen, the algorithm can jointly capture the full temporal structure of both time-varying rates and the interaction.

Next, we tested our algorithm's ability to simulate spike trains resulting from networks of pair-interacting neurons. Figure 4 shows typical activity patterns in a network of 100 units simulated by controlling pair-wise cross-correlation functions. Varying this correlation structure leads to a large diversity of network activity patterns that can be simulated.

## 4. High-Order Correlation Structure

Nonlinearly transforming gaussian processes, where the entire statistical structure is determined by the first- and second-order moments, introduces higher-order moments, which then carry over to the spike train statistics. These moments depend exclusively on the nonlinear function applied. In Figures 5A and 5B, we present the complexity distribution histogram and quantile-quantile plots for the distribution of synchronous events in spike trains with the same first- and second-order statistics generated by the three different nonlinear transformations. The complexity distribution is a useful way of characterizing the zero-lag high-order correlation structure that is defined in Grün, Abeles, and Diesmann (2008) as the probability *p*(ξ) to observe a particular number of synchronous spikes ξ (complexity). These results support the notion that the three nonlinear transformations lead to a different higher-order correlation structure. While the mean level of coincidences is the same (here *E*(ξ) = 5), the complexity distribution resulting from the exponential nonlinear transformation has significantly wider tails (corresponding to a higher proportion of highly synchronous events). The square and absolute value transformations appear to lead to a fairly similar high-order correlation structure.
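The complexity distribution itself is straightforward to estimate from binned spike trains. The Python sketch below illustrates the computation; independent neurons are used purely as a sanity check, since the complexity is then Binomial(*N*, *p*):

```python
import numpy as np

def complexity_distribution(spikes):
    """p(xi): probability of observing xi synchronous spikes per time bin,
    given a (neurons x bins) array of binary spike indicators."""
    xi = spikes.sum(axis=0)                      # synchronous count per bin
    counts = np.bincount(xi, minlength=spikes.shape[0] + 1)
    return counts / counts.sum()

rng = np.random.default_rng(3)
N, T, p = 20, 200_000, 0.05                      # assumed illustration values
spikes = (rng.random((N, T)) < p).astype(int)    # independent Bernoulli spikes
pxi = complexity_distribution(spikes)
print(pxi[:4])   # close to the Binomial(20, 0.05) pmf: 0.358, 0.377, 0.189, 0.060
```

Applying the same estimator to spike trains generated with the three nonlinearities (at matched first- and second-order statistics) reproduces the tail differences discussed above.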

We can derive the high-order correlation structure analytically for the exponential transformation $\Lambda_i = \exp(\mu_i + \sigma_i X_i)$. The $n$-order correlation of transformed correlated normally distributed variables $X_i \sim N(0, 1)$, $E[X_i X_j] = r_{ij}$, has the following form:

$$E[\Lambda_1 \Lambda_2 \cdots \Lambda_n] = E\left[\exp\left(\sum_i \mu_i + \sum_i \sigma_i X_i\right)\right] = \exp\left(\sum_i \mu_i + \tfrac{1}{2}\,\mathrm{Var}\left(\sum_i \sigma_i X_i\right)\right),$$

where the second step follows from Wick's theorem. Expanding and then collecting terms,

$$E[\Lambda_1 \Lambda_2 \cdots \Lambda_n] = \prod_i E[\Lambda_i] \cdot \prod_{i < j} \exp(\sigma_i \sigma_j r_{ij}),$$

which can be further simplified using equation 2.4:

$$E[\Lambda_1 \Lambda_2 \cdots \Lambda_n] = \prod_i E[\Lambda_i] \cdot \prod_{i < j} \frac{R_{ij}}{E[\Lambda_i]\,E[\Lambda_j]}.$$

## 5. Discussion

The generation of continuous gaussian processes with a predefined correlation structure is a classical problem with well-known solutions and numerous applications in all fields of signal and image processing. However, while the empirical correlation structure of neural spike trains has been measured many thousands of times using well-known analysis tools introduced in the 1960s (Perkel, Gerstein, & Moore, 1967a, 1967b; Gerstein & Perkel, 1969), the development of a complementary simulation tool was largely ignored for four decades and has only recently begun to attract attention. The recent interest in controlled-correlation spike trains is mostly motivated by the emergence of experimental techniques for rapid patterned stimulation of neurons with high temporal and spatial resolution (Shoham, O'Connor et al., 2005; Gasparini & Magee, 2006; Losonczy & Magee, 2006; Nikolenko et al., 2007), as well as the recent suggestion in neural coding studies that second-order statistics dominate the observed neuroretinal spiking patterns and can explain essentially their full structure (Schneidman et al., 2006; Shlens et al., 2006). In addition, these developments could potentially find applications in econometric time series, earthquake data, network traffic, reliability research, optical communication, and heartbeat variability.

It is interesting to compare our approach with other methods developed recently for controlling the correlation structure of spike trains. The method by Niebur (2007) probabilistically distributes spikes from a “mother” spike train (Kuhn, Aertsen, & Rotter, 2003) to tune synchrony in stationary homogeneous Poisson processes. For addressing the more general problem of temporal correlation functions among different spike trains, Brette (2009, method II) extended this approach by adding a controlled temporal jitter to the spike timing. However, this general strategy was solved only for stationary cases, introduces only positive correlations, and generally leads to strong coupling between the auto- and cross-correlation structure (Tetzlaff et al., 2008). It is also not well connected to central concepts in the theory of random point processes, such as stochastic intensity, and could potentially introduce spurious high-order interdependencies.

In contrast, our strategy was based on using correlated gaussian processes as an underlying structure for generating the spike trains. This general concept was also developed independently in two very recent studies (Brette, 2009; Macke, Berens, Ecker, Tolias, & Bethge, 2009), in both cases using a threshold nonlinearity. The study by Brette (2009, method I) generates rate processes by applying a rectifying nonlinearity to gaussian processes with a specific exponential autocorrelation function. The rectifying nonlinearity is conceptually close to the absolute-value transformation studied here, and formulas for the correlation distortion it creates can be derived using the procedures we introduce in the appendix (see also above). However, as noted by Brette (2009), it has the undesirable property that when the correlations are strong, the rates are zero most of the time, and it is therefore of limited use for simulating realistic spike trains. In addition, Brette limits his discussion to the case of low correlations, without the concept of correlation distortion, which is central to the generality of our approach. In an alternative "dichotomized gaussian" approach introduced by Macke et al. (2009), gaussian processes are thresholded to directly generate the binary spike trains in the time domain, without going through the intermediate step of a rate process (see also Bethge & Berens, 2008, and Johnson, 1994, section IIID, for related analysis). Because the second step is deterministic, the structure is singly stochastic, in contrast to the doubly stochastic Poisson structure explored here. The dichotomized gaussian approach requires numerical solutions of nonlinear equations and was found to lead to spike trains with nearly maximal entropy.

Our method can generate a flexible and very broad range of auto- and cross-correlation structures. For example, it allows the generation of processes with the same autocorrelation and very different cross-correlation structures (see Figure 2), and either one can in practice be very different from an exponential function (in contrast to Brette, 2009). The method is practically limited to correlation structures that lead to positive-definite matrices (see equation 2.20). More generally, the method is limited to correlation structures that are realizable by pure rate covariation, without any precise spike coordination (Staude, Rotter, & Grün, 2008). The analysis by Staude et al. demonstrated that rate-covariation models have peak cross-correlations between spike trains that are weaker than the respective peak autocorrelations (the smaller of the two). Moreover, when two processes have slow dynamics (broad autocorrelations), they have a limited ability to sustain a narrow cross-correlation.

## Appendix: Correlation Distortion for an Absolute Value Transformation

We consider the transformation $\Lambda_i = |\mu_i + \sigma_i X_i|$, where $X_i \sim N(0, 1)$ and $r_{ii} = 1$, which results in a folded normal distribution of $\Lambda_i$. The solution is divided into two steps. First, we derive a solution for $\mu_i$ and $\sigma_i$ as a function of a desired $R_{ii} = E[\Lambda_i^2]$ and $E[\Lambda_i]$, and then proceed to derive $r_{ij}$ as a function of the calculated $\mu_i$, $\sigma_i$ and the desired $R_{ij}$. Expressions for the first and second moments of truncated gaussian random variables (both univariate and bivariate) were derived earlier in Rosenbaum (1961). To derive moments for the absolute value transformation, we treat it as a superposition of two (univariate) or four (bivariate) rectified gaussian random variables.

### A.1. Step 1.

The first two moments of $\Lambda_i$ (which can be seen as a sum of two truncated, scaled, and shifted gaussian random variables) are

$$E[\Lambda_i] = \sigma_i\sqrt{\tfrac{2}{\pi}}\,e^{-\mu_i^2/2\sigma_i^2} + \mu_i\,\mathrm{erf}\!\left(\tfrac{\mu_i}{\sqrt{2}\,\sigma_i}\right), \qquad R_{ii} = E[\Lambda_i^2] = \mu_i^2 + \sigma_i^2.$$

Several algebraic steps were omitted in the last expression.

Bounds on $R_{ii}$ can be derived from the above expressions: $E^2[\Lambda_i] \leqslant R_{ii} \leqslant \tfrac{\pi}{2}\,E^2[\Lambda_i]$.

Defining $h = \mu_i/\sigma_i$, the ratio $R_{ii}/E^2[\Lambda_i]$ becomes a function of the single variable $h$ that can be found numerically. After $h$ is found, the transformation parameters can be calculated from the expressions above:

$$\sigma_i = \sqrt{\frac{R_{ii}}{1 + h^2}}, \qquad \mu_i = h\,\sigma_i.$$
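The one-dimensional search for $h$ can be implemented with a simple bisection, exploiting the fact that the ratio decreases monotonically from $\pi/2$ (at $h = 0$) toward 1. This Python sketch uses assumed example targets; the bracketing interval and iteration count are arbitrary choices:

```python
import math

def folded_ratio(h):
    """R_ii / E[Lam]^2 for Lam = |mu + sigma*X| as a function of h = mu/sigma."""
    f = (math.sqrt(2.0 / math.pi) * math.exp(-h * h / 2.0)
         + h * math.erf(h / math.sqrt(2.0)))        # E[Lam]/sigma
    return (1.0 + h * h) / (f * f)

def abs_params(mean_rate, R_ii):
    """Solve for mu, sigma of the absolute-value transform by bisection on h."""
    q = R_ii / mean_rate**2
    if not 1.0 < q <= math.pi / 2.0:
        raise ValueError("need E[L]^2 < R_ii <= (pi/2) E[L]^2")
    lo, hi = 0.0, 50.0           # folded_ratio decreases from pi/2 toward 1
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if folded_ratio(mid) > q:
            lo = mid             # ratio still too large: h must grow
        else:
            hi = mid
    h = (lo + hi) / 2.0
    sigma = math.sqrt(R_ii / (1.0 + h * h))          # R_ii = mu^2 + sigma^2
    return h * sigma, sigma

mu, sigma = abs_params(20.0, 500.0)   # assumed targets: mean 20, R_ii 500
print(mu, sigma)
```

The recovered parameters reproduce both target moments: $\mu^2 + \sigma^2 = R_{ii}$ holds by construction, and the folded-normal mean evaluated at the solved $h$ returns the requested mean rate.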

### A.2. Step 2.

The bivariate second moment $E[\Lambda_i \Lambda_j] = R_{ij}$ is a function of a single variable $r_{ij}$ ($\mu_{i,j}$ and $\sigma_{i,j}$ were calculated previously). The predistorted value $r_{ij}$ can then be calculated by numerically solving this equation.

In summary, we calculate the transformation parameters ($\mu$ and $\sigma$) from the desired mean rate and the autocorrelation function at $\tau = 0$ of the rate process, and then proceed to calculate the predistorted correlations $r_{ij}$ for the gaussian processes from the desired correlations $R_{ij}$ of the nonnegative processes.

## Acknowledgments

This work was supported by Israeli Science Foundation grant 1248/06 and European Research Council starting grant 211055. We thank the two anonymous reviewers for their comments and suggestions.

## References

Simon, B. (1974). *The P(φ)₂ Euclidean (quantum) field theory*. Princeton, NJ: Princeton University Press.