### Abstract

In this study, we consider limit theorems for microscopic stochastic models of neural fields. We show that the Wilson–Cowan equation can be obtained as the limit in uniform convergence on compacts in probability for a sequence of microscopic models when the number of neuron populations distributed in space and the number of neurons per population tend to infinity. This result also allows us to obtain limits for qualitatively different stochastic convergence concepts, e.g., convergence in the mean. Further, we present a central limit theorem for the martingale part of the microscopic models which, suitably re-scaled, converges to a centred Gaussian process with independent increments. These two results provide the basis for presenting the neural field Langevin equation, a stochastic differential equation taking values in a Hilbert space, which is the infinite-dimensional analogue of the chemical Langevin equation in the present setting. On a technical level, we apply recently developed laws of large numbers and central limit theorems for piecewise deterministic processes taking values in Hilbert spaces to a master equation formulation of stochastic neuronal network models. These theorems are valid for processes taking values in Hilbert spaces, and are thereby able to incorporate spatial structures of the underlying model.

**Mathematics Subject Classification (2000):** 60F05, 60J25, 60J75, 92C20.

##### Keywords:

Stochastic neural field equation; Wilson–Cowan model; Piecewise deterministic Markov process; Stochastic processes in infinite dimensions; Law of large numbers; Martingale central limit theorem; Chemical Langevin equation

### 1 Introduction

The present study is concerned with the derivation and justification of neural field equations from finite size stochastic particle models, i.e., stochastic models for the behaviour of individual neurons distributed in finitely many populations, in terms of mathematically precise probabilistic limit theorems. We illustrate this approach with the example of the Wilson–Cowan equation, referred to below as Eq. (1.1).

We focus on the following two aspects:

(A) Often one wants to study deterministic equations such as Eq. (1.1) in order to obtain results on the ‘behaviour in the mean’ of an intrinsically stochastic system. Thus, we first discuss limit theorems of the law of large numbers type for the limit of infinitely many particles. These theorems connect the trajectories of the stochastic particle models to the deterministic solution of mean field equations, and hence provide a justification for studying Eq. (1.1) in order to draw inferences about the behaviour of the stochastic system.

(B) Secondly, we aim to characterise the internal noise structure of the complex discrete stochastic models, as in the limit of large numbers of neurons the noise is expected to be close to a simpler stochastic process. Ultimately, this yields a stochastic neural field model in terms of a stochastic evolution equation conceptually analogous to the *Chemical Langevin Equation*. The Chemical Langevin Equation is widely used in the study of chemical reaction networks for which the stochastic effects cannot be neglected but a numerical or analytical study of the exact discrete model is not possible due to its inherent complexity.

In this study, we understand a *microscopic model* to be a description in terms of a stochastic process, usually a Markov chain model, also called
a *master equation formulation* (cf. [3,5,8,9,22], which contain various master equation formulations of neural dynamics). In contrast,
a *macroscopic model* is a deterministic evolution equation such as (1.1). Deterministic mean field equations
have been used widely and for a long time to model and analyse large scale behaviour
of the brain. In their original deterministic form, they are successfully used to
model geometric visual hallucinations, orientation tuning in the visual cortex and
wave propagation in cortical slices to mention only a few applications. We refer to
[7] for a recent review and an extensive list of references. The derivation of these
equations is based on a number of arguments from statistical physics and for a long
time a justification from microscopic models was not available. The interest
in deriving mean field equations from stochastic microscopic models has been revived
recently, as it offers the possibility of deriving deterministic ‘corrections’ to the
mean field equations, also called second-order approximations. These corrections might
account for the inherent stochasticity, and thus incorporate so-called finite size
effects. This has been achieved by either applying a path-integral approach to the
master equation [8,9] or by a van Kampen system-size expansion of the master equation [5]. In more detail, the author in the latter reference proposes a particular master
equation for a finite number of neuron populations and derives the Wilson–Cowan equation
as the first-order approximation to the mean via employing the van Kampen system size
expansion and then taking the continuum limit for a continuum of populations. By keeping
the second-order terms as well, a ‘stochastic’ version of the mean field equation is
also presented in the sense of coupling the first moment equation to an equation for
the second moments.

However, the van Kampen system size expansion does not give a precise mathematical
connection, as it neither quantifies the type of convergence (quality of the limit)
nor states conditions under which the convergence is valid, nor does it allow one to characterise
the speed of convergence. Furthermore, particular care has to be taken in systems
possessing multiple fixed points of the macroscopic equation, and we refer to [5] for a discussion of this aspect in the neural field setting. The limited applicability
of the van Kampen system size expansion was already well known to van Kampen; see
Sect. 10 in [33]. In parallel to the work of van Kampen, T. Kurtz derived precise limit theorems connecting
sequences of continuous time Markov chains to solutions of systems of ordinary differential
equations; see the seminal studies [19,20] or the monograph [15]. Limit theorems of that type are usually called the *fluid limit*, *thermodynamic limit*, or *hydrodynamic limit*; for a review, see, e.g., [13].

As is thoroughly discussed in [5], establishing the connection between master equation models and mean field equations involves two limit procedures: first, a limit which takes the number of particles, in this case neurons per considered population, to infinity (thermodynamic limit), and second, a limit which gives the mean field by taking the number of populations to infinity (continuum limit). In this ‘double limit’, the theorems by Kurtz describe the connection obtained by taking the number of neurons per population to infinity, yielding a system of ordinary differential equations, one for each population. Then the extension from finite to infinite dimensional state space is obtained by a continuum limit. This procedure corresponds to the approach in [5]. Taking the double limit step by step thus raises the question of what happens if we first take the spatial limit and then the fluid limit, thus reversing the order of the limit procedures, or if we take the limits simultaneously. Recently, in an extension of the work of Kurtz, one of the present authors and co-authors established limit theorems that achieve this double limit [27], and are thus able to connect finite population master equation formulations directly to spatio-temporal limit systems, e.g., partial differential equations or integro-differential equations such as the Wilson–Cowan equation (1.1). In a general framework, these limit theorems were derived for Piecewise Deterministic Markov Processes on Hilbert spaces, which in addition to the jump evolution also allow for a coupled deterministic continuous evolution. This generality was motivated by applications to neuron membrane models consisting of microscopic models of the ion channels coupled to a deterministic equation for the transmembrane potential. We find that this generality is also advantageous for the present situation of a pure jump model, as it allows us to include time-dependent inputs.
In this study, we employ these theorems to achieve the aims (A) and (B) focussing on the example of the deterministic limit given by the Wilson–Cowan equation (1.1).

Finally, we state what this study does *not* contain, which in particular distinguishes the present study from [5,8,9] beyond mathematical technique. Presently, the aim is not to derive moment equations,
i.e., a deterministic set of equations that approximate the moments of the Markovian
particle model, but rather processes (deterministic or stochastic) to which a sequence
of microscopic models converges under suitable conditions in a probabilistic way.
This means that a microscopic model, which is close to the limit—presently corresponding
to a large number of neurons in a large number of populations—can be assumed to be
close to the limiting processes in structure and pathwise dynamics as indicated by
the quality of the stochastic limit. Hence, the present work is conceptually (though
neither in technique nor in results) close to [30], wherein, using a propagation of chaos approach in the context of neural field equations,
the author also derives in a mathematically precise way a limiting process for finite
particle models. However, it is an obvious consequence that the convergence of the
models necessarily implies a close resemblance of their moment equations. This provides
the connection to [5,8,9], which we briefly comment on in Appendix B.

As a guide, we close this introduction with an outline of the subsequent sections and some general remarks on the notation employed in this study. In Sects. 1.1 to 1.3, we first discuss the two types of mean field models in more detail, on the one hand, the Wilson–Cowan equation as the macroscopic limit and, on the other hand, a master equation formulation of a stochastic neural field. The main results of the paper are found in Sect. 2. There we set up the sequence of microscopic models and state conditions for convergence. Limit theorems of the law of large numbers type are presented in Theorem 2.1 and Theorem 2.2 in Sect. 2.1. The first is a classical weak law of large numbers providing uniform convergence on compacts in probability and the second convergence in the mean uniformly over the whole positive time axis. Next, a central limit theorem for the martingale part of the microscopic models is presented in Sect. 2.2 characterising the internal fluctuations of the model to be of a diffusive nature in the limit. This part of the study is concluded in Sect. 2.3 by presenting the Langevin approximations that arise as a result of the preceding limit theorems. The proofs of the theorems in Sect. 2 are deferred to Sect. 4. The study is concluded in Sect. 3 with a discussion of the implications of the presented results and an extension of these limit theorems to different master equation formulations or mean field equations.

*Notations and Conventions* Throughout the study, we denote by
*D* are always bounded with a sufficiently smooth boundary, where the minimal assumption
is a strong local Lipschitz condition; see [2]. For bounded domains *D*, this condition simply means that for every point on the boundary its neighbourhood
on the boundary is the graph of a Lipschitz continuous function. Furthermore, for
^{a} in
*ϕ* to *ψ*. Furthermore, the spaces

Norms in Hilbert spaces are denoted by

#### 1.1 The Macroscopic Limit

Neural field equations are usually classified into two types: *rate-based* and *activity-based* models. The prototype of the former is the Wilson–Cowan equation; see Eq. (1.1),
which we also restate below, and the Amari equation, see Eq. (3.7) in Sect. 3, is
the prototype of the latter. Besides being of a different structure, due to their
derivation, the variable they describe has a completely different interpretation.
In rate-based models, the variable describes the average rate of activity at a certain
location and time, roughly corresponding to the fraction of active neurons at a certain
infinitesimal area. In activity-based models, the macroscopic variable is an average
electrical potential produced by neurons at a certain location. For a concise physical
derivation that leads to these models, we refer to [5]. In the following, we consider rate-based equations, in particular, the classical
Wilson–Cowan equation, to discuss the type of limit theorems we are able to obtain.
We remark that the results are essentially analogous for activity-based models.

Thus, the macroscopic model of interest is given by the equation

where
*y* to a neuron located at *x*, and finally,
*x* at time *t*. For the weight function
*I*, we assume that
*f*, we assume in this study that *f* is non-negative, satisfies a global Lipschitz condition with constant

and it is bounded. From an interpretive point-of-view, it is reasonable and consistent
to stipulate that *f* is bounded by one—being a fraction—as well as being monotone. The latter property
corresponds to the fact that higher input results in higher activity. In specific
models, *f* is often chosen to be a sigmoidal function, e.g.,
*f* are even infinitely often differentiable with bounded derivatives, which already
implies the Lipschitz condition (1.4).
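
For orientation, the assumptions above can be collected in one display. This is a reconstruction from the symbols named in the text (time constant $\tau$, weights $w$, input $I$, gain function $f$, solution $\nu$, domain $D$) and from the standard form of the rate-based Wilson–Cowan equation, not a verbatim copy of Eqs. (1.3) and (1.4). The symbol $L$ for the Lipschitz constant and the logistic example are assumptions made here for illustration.

```latex
% Assumed standard form of the rate-based Wilson--Cowan equation
\tau\,\frac{\partial \nu}{\partial t}(t,x)
  = -\nu(t,x)
    + f\!\left(\int_D w(x,y)\,\nu(t,y)\,\mathrm{d}y + I(t,x)\right),
  \qquad x \in D,\ t \ge 0 .

% Global Lipschitz condition on the gain function (L is a hypothetical name)
|f(a) - f(b)| \le L\,|a - b| \qquad \text{for all } a, b \in \mathbb{R} .

% One common sigmoidal choice (logistic function), bounded by one and monotone
f(z) = \frac{1}{1 + e^{-z}} .
```

The logistic function satisfies all stated requirements: it is non-negative, bounded by one, monotone, and infinitely often differentiable with bounded derivatives.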

The Wilson–Cowan equation (1.3) is well-posed in the strong sense as an integral equation
in
*ν* to every initial condition
*D*, then it holds for all
*f* is at least *α*-times differentiable with bounded derivatives and the weights and the input function
satisfy
*ν* is jointly continuous on
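
As a rough numerical illustration of the macroscopic model, the following sketch integrates a discretised Wilson–Cowan equation with the explicit Euler method. It assumes the standard form $\tau\,\partial_t \nu = -\nu + f(\int_D w\,\nu\,\mathrm{d}y + I)$ on the one-dimensional domain $D = [0,1]$; the grid, the exponential weight kernel, the logistic gain and the function name `solve_wilson_cowan` are illustrative choices, not taken from the paper.

```python
import numpy as np


def logistic(z):
    """Sigmoidal gain function with values in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def solve_wilson_cowan(nu0, w, inp, tau=1.0, dt=0.01, steps=1000, f=logistic):
    """Explicit Euler scheme for tau * dnu/dt = -nu + f(int w nu dy + I).

    nu0 : initial activity on a uniform grid of [0, 1]
    w   : matrix of weights w(x_i, y_j) on the same grid
    inp : function t -> array of input values I(t, x_i)
    """
    m = nu0.size
    dx = 1.0 / m                          # quadrature weight of the grid
    nu = nu0.astype(float).copy()
    for k in range(steps):
        t = k * dt
        drive = w @ nu * dx + inp(t)      # Riemann sum for the integral term
        nu = nu + (dt / tau) * (-nu + f(drive))
    return nu


# Illustrative setup: 50 grid cells, exponentially decaying weights, no input.
m = 50
x = (np.arange(m) + 0.5) / m
w = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.1)
nu_final = solve_wilson_cowan(np.zeros(m), w, lambda t: np.zeros(m))
```

Because each Euler step is a convex combination of the current state and a value of $f$ whenever `dt / tau <= 1`, the discrete solution stays in $[0,1]$ if the initial condition does, which mirrors the interpretation of $\nu$ as a fraction of active neurons.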

#### 1.2 Master Equation Formulations of Neural Network Models

For the microscopic model, we concentrate on a variation of the model considered in [5,6], which is already an improvement on a model introduced in [11]. We extend the model to include variations among neuron populations and, foremost, time-dependent inputs. We chose this model over the master equation formulations in [8,9] as it provides a more direct connection between the microscopic and macroscopic models; see also the discussion in Sect. 3. We describe the main ingredients of the model, beginning with the simpler, time-independent model prevalent in the literature. Subsequently, in Sect. 1.3 the final, time-dependent model is defined.

We denote by *P* the number of neuron populations in the model. Further, we assume that the *k*th neuron population consists of identical neurons which can be in one of two
possible states: *active*, i.e., emitting action potentials, and *inactive*, i.e., quiescent or not emitting action potentials. Transitions between states occur
instantaneously and at random times. For all
*t*. An integer
*k*th population, at least for sufficiently large values. However, this is not accurate
in the literal sense as it is possible with positive probability for populations to
contain more than
^{b} It is a corollary of these that the probability of more than

Proceeding with notation,
*k* are governed by a constant inactivation rate
*k* receives from a neuron in population *j*. Then the activation rate of a neuron in population *k* is proportional to

for a non-negative function
*f* in the Wilson–Cowan equation (1.3). We remark that here *f* is *not* the rate of activation of one neuron. In this model, the activation rate of a population
is not proportional to the number of inactive neurons but it is proportional to
*becomes or remains* active.

It follows that the process
*k*th basis vector of

which is endowed with the boundary conditions
*θ* at time *t*. Finally, the definition is completed by stating an initial law ℒ, the distribution
of

Another definition of a continuous-time Markov chain is via its generator; see, e.g., [15]. Although the master equation is widely used in the physics and chemical reaction literature, the mathematically more appropriate object for the study of a Markov process is its generator; the master equation is an object derived from the generator, see Sect. V in [33]. The generator of a Markov process is an operator defined on the space of real functions over the state space of the process. For the above model defined by the master equation (1.6), the generator is given by

for all suitable
*λ* is the total instantaneous jump rate, given by

and defines the distribution of the waiting time until the next jump, i.e.,

Further, the measure *μ* in (1.7) is a Markov kernel on the state space of the process defining the conditional
distribution of the post-jump value, i.e.,

for all sets
*θ*, the measure *μ* is given by the discrete distribution

The importance of the generator lies in the fact that it fully characterises a Markov process and that convergence of Markov processes is strongly connected to the convergence of their generators; see [15].
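
The description via the total jump rate and the post-jump kernel translates directly into a simulation scheme of Gillespie type: draw an exponential waiting time with the total rate, then pick the transition with probability proportional to its individual rate. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. It assumes that population *k* loses an active neuron at rate `alpha * theta[k]` and gains one at rate `scale[k] * f((W @ theta)[k])`; the names `gillespie`, `alpha`, `scale` and `W`, and all numerical values, are hypothetical.

```python
import numpy as np


def gillespie(theta0, W, alpha, scale, f, t_end, rng):
    """Simulate the time-homogeneous jump process up to time t_end.

    theta0 : initial numbers of active neurons per population
    W      : weight matrix between populations (assumed form)
    alpha  : inactivation rate per active neuron (assumed form)
    scale  : proportionality constants of the activation rates (assumed)
    """
    theta = np.array(theta0, dtype=int)
    P = theta.size
    t = 0.0
    times, states = [0.0], [theta.copy()]
    while True:
        act = scale * f(W @ theta)        # activation rate of each population
        inact = alpha * theta             # inactivation rate, zero if theta[k] == 0
        rates = np.concatenate([act, inact])
        lam = rates.sum()                 # total instantaneous jump rate
        if lam <= 0.0:
            break
        t += rng.exponential(1.0 / lam)   # exponential waiting time
        if t > t_end:
            break
        j = rng.choice(2 * P, p=rates / lam)   # post-jump kernel
        if j < P:
            theta[j] += 1                 # one neuron becomes active
        else:
            theta[j - P] -= 1             # one neuron becomes inactive
        times.append(t)
        states.append(theta.copy())
    return np.array(times), np.array(states)


# Illustrative run: 3 populations with nominal size N = 50 each.
P, N = 3, 50
rng = np.random.default_rng(0)
W = np.full((P, P), 1.0 / (P * N))
f = lambda z: 1.0 / (1.0 + np.exp(-z))
times, states = gillespie(np.zeros(P), W, 1.0, N * np.ones(P), f, 2.0, rng)
```

Note that the state is not capped from above, in line with the remark that a population may contain more active neurons than its nominal size with positive probability.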

#### 1.3 Including External Time-Dependent Input

Until now, the microscopic model does not incorporate any time-dependent input into
the system. In analogy to the macroscopic equation (1.3), this input enters into the
model inside the active rate function
*k* at time *t*, then the time-dependent activation rate is given by

The most important qualitative difference when substituting (1.5) by (1.11) is that the corresponding Markov process is no longer homogeneous. In particular, the waiting time distributions in between jumps are no longer exponential, but satisfy

Hence, the resulting process is an inhomogeneous continuous-time Markov chain; see,
e.g., Sect. 2 in [36]. It is straightforward to write down the corresponding master equation analogously
to (1.6) yielding a system of non-autonomous ordinary differential equations, cf.
the master equation formulation in [8]. Similarly, there exists the notion of a time-dependent generator for inhomogeneous
Markov processes, cf. Sect. 4.7 in [15]. Employing a standard trick, that is, suitably extending the state space of the process,
we can transform an inhomogeneous into a homogeneous Markov process [15,28]. That is, the space-time process

where the jump intensity *λ* is given by the sum of all individual time-dependent rates analogously to (1.8).
Finally, the post jump value is given by a Markov kernel
*μ* is the obvious time-dependent modification of (1.10).
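
For a time-dependent total rate $\lambda(s)$ along a fixed state, the waiting time $T$ after time $t_0$ has the survival function $\exp(-\int_{t_0}^{t_0+t}\lambda(s)\,\mathrm{d}s)$ instead of an exponential one. One standard way to sample such waiting times is thinning: propose jumps from a homogeneous process with a constant upper bound on the rate and accept each proposal with probability $\lambda(t)/\text{bound}$. Thinning is a generic simulation technique introduced here for illustration and is not taken from the paper; the function `next_jump_time` and the sinusoidal rate are hypothetical.

```python
import numpy as np


def next_jump_time(rate, t0, rate_bound, rng):
    """Sample the first jump time after t0 by thinning.

    rate       : function t -> jump rate at time t, with rate(t) <= rate_bound
    rate_bound : constant upper bound on the rate (must be positive)
    """
    t = t0
    while True:
        t += rng.exponential(1.0 / rate_bound)      # candidate jump time
        if rng.uniform() * rate_bound <= rate(t):   # accept w.p. rate(t) / bound
            return t


# Illustrative time-dependent rate, bounded by 1.5.
rate = lambda t: 1.0 + 0.5 * np.sin(t)
rng = np.random.default_rng(1)
samples = np.array([next_jump_time(rate, 0.0, 1.5, rng) for _ in range(5000)])

# Exact survival probability P(T > 1) = exp(-int_0^1 rate(s) ds) for comparison.
survival_exact = np.exp(-(1.0 + 0.5 * (1.0 - np.cos(1.0))))
survival_empirical = np.mean(samples > 1.0)
```

In a full simulation of the inhomogeneous chain, the bound has to dominate the total rate of the current state for all times, which is possible here because the gain function is bounded.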

It thus follows that the space-time process

### 2 A Precise Formulation of the Limit Theorems

In this section, we present the precise formulations of the limit theorems. To this
end, we first define a suitable sequence of microscopic models, which gives the connection
between the defining objects of the Wilson–Cowan equation (1.3) and the microscopic
models discussed in Sect. 1.2. Thus,
*n*. That is
*n*th model,
*k*th population of the *n*th model and analogously we use the notations
*n* and *τ* is the time constant in the Wilson–Cowan equation (1.3). In the following paragraphs,
we discuss the connection of the defining components of this sequence of microscopic
models to the components of the macroscopic limit.

*Connection to the Spatial Domain D* A key step of connecting the microscopic models to the solution of Eq. (1.3) is that
we need to put the individual neuron populations into relation to the spatial domain
*D* the solution of (1.3) lives on. To this end, we assume that each population is located
within a sub-domain of *D* and that the sub-domains of the individual populations are non-overlapping. Hence,
for each
*D* denoted by
*D*, e.g., all Jordan measurable domains, a sequence of convex partitions can be found
such that additionally the conditions imposed in the limit theorems below are also
satisfied. One may think of obtaining the collection
*D* nor that the partitions consist of refinements. Necessary conditions on the limiting
behaviour of the sub-domains are very strongly connected to the convergence of initial
conditions of the models, which is a condition in the limit theorems; see below. For
the sake of terminological simplicity, we refer to

We now define some notation for parameters characterising the partitions

and the maximum diameter of the partition is denoted by

where the diameter of a set
*n*th model, i.e.,

*Connection to the Weight Function w* We assume that there exists a function

where *w* is the same function as in the Wilson–Cowan equation (1.3). For the definition of
the activation rate at time *t*, we thus obtain

As already highlighted by Bressloff [5], the transition rates are not uniquely defined by the requirement that a possible limit to the microscopic models is given by the Wilson–Cowan equation (1.3). If in (2.5), the definition of the transition rates is changed to

where
*f*, then all limit theorems remain valid. The proof can be carried out as presented
adding and subtracting the appropriate term where the additional difference term vanishes
due to
*τ*, the weights *w*, and the input *I*.

*Connection to the Input Current I*

The external input which is applied to neurons in a certain population is obtained by spatially averaging a space-time input over the sub-domain that population is located in, i.e.,
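
The spatial averaging described above amounts to $I_k(t) = |D_k|^{-1}\int_{D_k} I(t,x)\,\mathrm{d}x$ for a population located in the sub-domain $D_k$, where the notation $I_k$ and $D_k$ is assumed here for illustration. The following sketch computes such cell averages for a one-dimensional domain partitioned into intervals, using a midpoint rule; the function name `cell_averages` and the example input are hypothetical.

```python
import numpy as np


def cell_averages(g, edges, n_quad=200):
    """Average a function g over each interval [edges[k], edges[k+1]].

    Uses the composite midpoint rule with n_quad nodes per interval.
    """
    out = np.empty(len(edges) - 1)
    for k in range(len(edges) - 1):
        a, b = edges[k], edges[k + 1]
        nodes = a + (np.arange(n_quad) + 0.5) * (b - a) / n_quad
        out[k] = np.mean(g(nodes))        # equals (1 / |D_k|) * integral over D_k
    return out


# Illustrative space-time input on D = [0, 1], split into four sub-domains.
I = lambda t, x: x * np.cos(t)
edges = np.linspace(0.0, 1.0, 5)
avg = cell_averages(lambda x: I(0.0, x), edges)
```

For this linear-in-space input the midpoint rule is exact, so each average equals the input at the midpoint of the respective sub-domain.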

This completes the definition of the Markov jump processes

and the transition measure

for all

*Connection to the Solution ν* As functions of time, the paths of the PDMP
*ν* live on different state spaces. The former takes values in
*coordinate function*, which is also the terminology used in [13]. In fact, the limit theorems we subsequently present actually are for the processes
we obtain from the composition of the coordinate functions with the PDMPs. Here, it
is important to note that for each

Clearly, each

*Connection of the Initial Conditions*

One condition in the subsequent limit theorems is the convergence of initial conditions in probability, i.e., the assumption that

It is easy to see that such a sequence of initial conditions
*D* and the sequence of partitions

Next, assuming that partitions fill the whole domain *D* for
*D*. Assume that *D* is Jordan measurable, i.e., a bounded domain such that the boundary is a Lebesgue
null set, and let
*D*. We define
*D*. As *D* is Jordan measurable, these partitions fill up *D* from inside and

In the remainder of this section, we now collect the main results of this article. We start with the law of large numbers, which establishes the connection to the deterministic mean field equation, and then proceed to central limit theorems which provide the basis for a Langevin approximation. The proofs of the results are deferred to Sect. 4.

#### 2.1 A Law of Large Numbers

The first law of large numbers takes the following form. Note that the assumptions imply that the number of neuron populations diverges.

**Theorem 2.1** (Law of large numbers)

*Let*
*and*
*Assume that the sequence of initial conditions converges to*
*in probability in the space*
*i*.*e*., (2.8) *holds*, *that*
*and that*

*holds*. *Then it follows that the sequence of*
*valued jump*-*process*
*converges uniformly on compact time intervals in probability to the solution* *ν* *of the Wilson–Cowan equation* (1.3), *i*.*e*., *for all*
*it holds that*

*Moreover*, *if for*
*the initial conditions satisfy in addition*
*then convergence in the* *rth mean holds*, *i*.*e*., *for all*

*Remark 2.1* The norm of the uniform convergence

**Corollary 2.1** *Let*
*and set*

*Further*, *assume that*
*and*
*and that the sequence of initial conditions converges to*
*in probability in the space*
*that*
*and*

*where* 1− *denotes an arbitrary positive number strictly smaller than* 1. *Then it holds for all*
*that*

*and for*
*if the additional boundedness assumptions of Theorem *2.1 *are satisfied*, *that for all*

*Remark 2.2* We believe that fruitful and illustrative comparisons of these convergence results
and their conditions to the results in Kotelenez [17,18], and particularly, Blount [4] can be made. Here, we just mention that the latter author conjectured the conditions
(2.13) to be optimal for the convergence, but was not able to prove this result in
his model of chemical reactions with diffusions for the region

##### 2.1.1 Infinite-Time Convergence

In the law of large numbers, Theorem 2.1, and its Corollary 2.1 we have presented
results of convergence over finite time intervals. Employing a different technique,
we are also able to derive a convergence result over the whole positive time axis
motivated by a similar result in [32]. The proof of the following theorem is deferred to Sect. 4.3. Restricted to finite
time intervals, the subsequent result is strictly weaker than Theorem 2.1. However,
the result is important when one wants to analyse the mean long time behaviour of
the stochastic model via a bifurcation analysis of the deterministic limit as (2.14)
suggests that
*n*.

**Theorem 2.2** *Let*
*and assume that the conditions of Corollary *2.1 *are satisfied*. *We further assume that the current input function*
*satisfies*
*i*.*e*., *it is square integrable in*
*over bounded intervals*, *and possesses first spatial derivatives bounded for almost all*
*in*
*Then it holds that*

#### 2.2 A Martingale Central Limit Theorem

In this section, we present a central limit theorem for a sequence of martingales
associated with the jump processes

Here, the process
*ν* of the Wilson–Cowan equation (1.3). Now interpreting Eq. (2.15) as a stochastic evolution
equation, which is driven by the martingale
*n*. Deriving a suitable limit for

First of all, what has been said so far implies the necessity of re-scaling the martingale with a diverging sequence in order to obtain a non-trivial limit. The conditions in the law of large numbers imply in particular that the martingale converges uniformly in the mean square to zero, i.e.,

which in turn implies convergence in probability and convergence in distribution to the zero limit.

Furthermore, in contrast to Euclidean spaces, norms on infinite-dimensional spaces
are usually not equivalent. In Corollary 2.1, we exploited this fact as it allowed
us to obtain convergence results under less restrictive conditions by changing to
strictly weaker norms. In the formulation and proof of central limit theorems, the
change to weaker norms even becomes an essential ingredient. It is often observed
in the literature (see, e.g., [4,17,18]) that central limit theorems cannot be proven in the strongest norm for which the
law of large numbers holds, e.g.,
*d* is the dimension of the spatial domain *D*, using the embedding of
^{c} due to Maurin’s theorem and

The limit we propose for the re-scaled martingale sequence is a *centred diffusion process* in
^{d} if

(i) each
*symmetric* and *positive*, i.e.,

(ii) each
*trace class*, i.e., for one (and thus every) orthonormal basis

(iii) and the family
*continuously increasing* in *t* in the sense that the map

We next define the process, which will be the limit identified in the martingale central
limit theorem via its covariance. In order to define the operator *C*, we first define a family of linear operators

It is obvious that this bilinear form is symmetric and positive and, as
*t*, it holds that the map

as the solution of the Wilson–Cowan equation *ν* and the gain function *f* are pointwise bounded. Hence, due to the Cauchy–Schwarz inequality, the norm
*ϕ*, *ψ* in

Summing these inequalities for all
*t*.

Now, it holds that the map

Clearly, the resulting bilinear form
*t* for all

We are now able to state the martingale central limit theorem. The proof of the theorem is deferred to Sect. 4.4.

**Theorem 2.3** (Martingale central limit theorem)

*Let*
*and assume that the conditions of Theorem *2.1 *are satisfied*. *In particular*, *convergence in the mean holds*, *i*.*e*., (2.11) *holds for*
*Additionally*, *we assume it holds that*

*Then it follows that the sequence of re*-*scaled*
*valued martingales*

*converges weakly on the space of*
*valued càdlàg functions to the*
*valued diffusion process defined by the covariance operator*
*given by* (2.18).

*Remark 2.3* In connection with the results of Theorem 2.3, two questions may arise. First, in
what sense is there uniqueness of the re-scaling sequence, and hence of the limiting
diffusion? That is, does a different scaling also produce a (non-trivial) limit, or,
rephrased, is the proposed scaling the correct one to look at? Secondly, the theorem
deals with the norms for the range of

Regarding the first question, it is immediately obvious that the re-scaling sequence
*c*, and hence the limit is essentially the same process with either ‘stretched’ or ‘shrunk’
variability. However, the asymptotic behaviour of the re-scaling sequences that
allow for a non-trivial weak limit is unique. In general, by considering different
re-scaling sequences

Unfortunately, an equally clear answer to the second question is not possible when
considering non-trivial limits. Essentially, we can only say that the currently used
methods do not allow for any conclusion on convergence. The limitations are the following:
The central problem is that for the parameter range

#### 2.3 The Mean-Field Langevin Equation

An important property of the limiting diffusion with a view towards analytic and numerical
studies is that it can be represented by a stochastic integral with respect to a cylindrical
or *Q*-Wiener process. For a general discussion of infinite-dimensional stochastic integrals,
we refer to [12]. First, let
*α*. Here,

is a diffusion process in
*linear noise approximation*

or in differential notation

where
*n*. Here, we have used the operator notation

Equation (2.21) is an infinite-dimensional stochastic differential equation with
additive (linear) noise. Here, additive means that the coefficient in the diffusion
term does not depend on the solution
*Langevin approximation*. Here, the dependence of the diffusion coefficient on the deterministic limit *ν* is formally substituted by a dependence on the solution. That is, we obtain a stochastic
partial differential equation with multiplicative noise given by

or in differential notation

Note that the derivation of the above equations was only formal, hence we have to
address the existence and uniqueness of solutions and the proper setting for these
equations. This is left for future work. Which diffusion approximation, if any, is
the correct one to use is a matter of ongoing discussion; the question is probably
undecidable for lack of a criterion of approximation quality. First of all, note that for both versions
the noise term vanishes for
*f* were linear. However, they are close to the mean of the discrete process. We discuss
this aspect in Appendix B.

Furthermore, we already observe in the central limit theorem, and thus also in the
linear noise and Langevin approximation that the covariance (2.18) or the drift and
the structure of the diffusion terms in (2.21) and (2.22), respectively, are independent
of objects resulting from the microscopic models. They are defined purely in terms
of the macroscopic limit. This observation supports the conjecture that these approximations
are independent of the possibly different microscopic models converging to the same
deterministic limit. Analogous statements hold also for derivations from the van Kampen
system size expansion [5] and in related limit theorems for reaction diffusion models [4,17,18]. The only object reminiscent of the microscopic models in the continuous approximations
is the re-scaling sequence

*Remark 2.4* The stochastic partial differential equations (2.21) and (2.22), which we proposed
as the linear noise or Langevin approximation, respectively, are not necessarily unique
as the representation of the limiting diffusion as a stochastic integral process (2.20)
may not be unique. It will be the subject of further research to analyse the
practical implications and usability of this Langevin approximation. Let *Q* be a trace class operator,
*Q*-Wiener process and let
^{∗} denotes the adjoint operator. Then also the stochastic integral process

is a version of the limiting diffusion in (2.3) and the corresponding linear noise and Langevin approximations are given by

and

We conclude this section by presenting one particular choice of a diffusion coefficient
and a Wiener process. We take
*j* is the embedding operator

We first investigate the operator

where *k* is the embedding operator

### 3 Discussion and Extensions

In this article, we have presented limit theorems that connect finite, discrete microscopic models of neural activity to the Wilson–Cowan neural field equation. The results state qualitative connections between the models formulated as precise probabilistic convergence concepts. Thus, the results strengthen the connection derived in a heuristic way from the van Kampen system size expansion.

A general limitation of mathematically precise approaches to approximations, cf. also the propagation of chaos limit theorems in [30], is that the microscopic models are usually defined via the limit. In other words, the limit has to be known a priori, and we look for models which converge to this limit. Thus, in contrast to the van Kampen system size expansion, the presented results are not a step-by-step modelling procedure in the sense that, via a constructive limiting procedure, a microscopic model yields a deterministic or stochastic approximation. Hence, it might be objected that the presented method can only be used a posteriori in order to justify a macroscopic model from a constructed microscopic model and that somehow one has to ‘guess’ the correct limit in advance. Several remarks can be made to answer this objection.

First, this observation is certainly true, but not necessarily a drawback. On the contrary, when both microscopic and macroscopic models are available, it is rather important to know how these are connected and to characterise this connection qualitatively and quantitatively. Concerning neural field models, this precise connection was simply not available so far for the well-established Wilson–Cowan model. Furthermore, when one starts from a stochastic microscopic description and works through the conditions for convergence for given microscopic models, one obtains very strong hints on the structure of a possible deterministic limit. Therefore, our results can also ease the procedure of ‘guessing the correct limit’.

Secondly, a phenomenological, deterministic model, which is an approximation to an inherently probabilistic process, is often derived from ad-hoc heuristic arguments. Given that the model has proved useful, one often aims to derive a justification from first principles and/or a stochastic version, which keeps the features of the deterministic model but also accounts for the formerly neglected fluctuations. A standard, though somewhat simple, approach to obtain stochastic versions consists of adding (small) noise to the deterministic equations. This article provides a second approach, which consists of finding microscopic models that converge to the deterministic limit and obtaining a stochastic correction via a central limit argument.

Thirdly and finally, the method also provides an argument for new equations, i.e., the Langevin and linear noise approximations, which can be used to study the stochastic fluctuations in the model. Furthermore, in contrast to previous studies, we do not provide deterministic moment equations but stochastic processes, which can be studied, e.g., via Monte Carlo simulations, with respect to a large number of pathwise properties and dynamics beyond first and second moments.
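To illustrate what a pathwise Monte Carlo study could look like, the following is a minimal sketch only. It does not implement the Langevin approximation of this article: as a stand-in it uses a spatially discretised Wilson–Cowan-type drift with additive noise of constant amplitude, i.e., the simple approach of adding small noise to the deterministic equation. The weight kernel, the logistic gain function, the noise amplitude `sigma`, the threshold, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def simulate_paths(num_paths, m, t_end, dt, tau, sigma, rng):
    """Euler-Maruyama for a grid version of
        d nu = (1/tau) * (-nu + f(W nu)) dt + sigma dB.
    The additive noise is a placeholder, not the article's diffusion term."""
    h = 1.0 / m
    x = (np.arange(m) + 0.5) * h                              # grid midpoints on D = (0, 1)
    w = 2.0 * np.exp(-np.abs(x[:, None] - x[None, :]) / 0.2)  # hypothetical weight kernel
    f = lambda u: 1.0 / (1.0 + np.exp(-(u - 1.0)))            # hypothetical logistic gain
    nu = np.zeros((num_paths, m))
    running_max = np.zeros(num_paths)                         # sup over time and space, per path
    for _ in range(int(round(t_end / dt))):
        drift = (-nu + f(h * nu @ w.T)) / tau                 # quadrature of the integral term
        noise = sigma * np.sqrt(dt) * rng.normal(size=nu.shape)
        nu = nu + drift * dt + noise
        running_max = np.maximum(running_max, nu.max(axis=1))
    return nu, running_max

rng = np.random.default_rng(0)
final, running_max = simulate_paths(num_paths=200, m=50, t_end=5.0, dt=0.01,
                                    tau=1.0, sigma=0.1, rng=rng)
# A pathwise quantity that moment equations do not provide directly:
# the estimated probability that the activity exceeds a threshold at some time.
prob = float(np.mean(running_max > 1.0))
```

The exceedance probability is only one example. Any functional of the sample paths can be estimated in the same way, provided the noise term is replaced by the one appropriate for the equation under study.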

We now conclude this article by commenting on the feasibility of our approach connecting microscopic Markov models to deterministic macroscopic equations when dealing with different master equation formulations that appear in the literature. Additionally, the following discussions also relate the model (1.6) considered in this article to other master equation formulations. We conjecture that results analogous to those presented for the Wilson–Cowan equation (1.3) in Sect. 2 also hold for these variations of the master equations. This should be achievable by adapting the methods of proof presented, although we have not performed the computations in detail.

#### 3.1 A Variation of the Master Equation Formulation

A first variation of the discrete model we discussed in Sect. 1.2 was considered in
the articles [8,9] and a version restricted to a bounded state space also appears in [31]. This model consists of the master equation stated below in (3.2), which closely
resembles (1.6). In the earlier reference [8], the model was introduced with a different interpretation called the *effective spike model*. We briefly explain this interpretation before presenting the master equation. Instead
of interpreting *P* as the number of neuron populations, in this model, *P* denotes the number of different neurons in the network located within a spatial domain
*D*. Then
*k*th neuron, counts the number of ‘effective’ spikes this neuron has emitted in the
past up to time *t*. Effective spikes are those spikes that still influence the dynamics of the system,
e.g., via a post-synaptic potential. Then state transitions adding/subtracting one
effective spike for the *k*th neuron are governed by a firing rate function
*k*, and a decay rate
*τ* and the gain function is defined—neglecting external input—by

where
*not* equal to the gain function *f* in the proposed limiting Wilson–Cowan equation (1.3), but rather connected to *f* such that

The authors in [9] state that for any function *f* such a function

with boundary conditions

On the level of Markov jump processes, the master equation (3.2) obviously describes
dynamics similar to the master equation (1.6) only replacing the activation rate
*after* a limit procedure taking
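The Markov jump dynamics behind such master equations can be simulated exactly with a Gillespie-type algorithm. The following minimal sketch assumes unit up and down transitions per component, an activation rate given by a gain function applied to the weighted input, and a decay rate proportional to the current count. These rates, the weight matrix, the logistic gain, and the name `gillespie_network` are illustrative assumptions and are not taken from (1.6) or (3.2).

```python
import numpy as np

def gillespie_network(w, f, tau, n0, t_end, rng):
    """Exact simulation of a jump process with unit transitions.

    Assumed rates (illustrative only):
      n_k -> n_k + 1 at rate f(sum_j w[k, j] * n_j)
      n_k -> n_k - 1 at rate n_k / tau
    """
    n = np.array(n0, dtype=np.int64)
    P = len(n)
    t = 0.0
    times, states = [t], [n.copy()]
    while True:
        up = f(w @ n)                        # activation rates, one per component
        down = n / tau                       # decay rates, zero whenever n_k == 0
        rates = np.concatenate([up, down])
        total = rates.sum()
        if total <= 0.0:
            break                            # no transition possible
        t += rng.exponential(1.0 / total)    # exponential waiting time to the next jump
        if t > t_end:
            break
        event = rng.choice(2 * P, p=rates / total)
        if event < P:
            n[event] += 1
        else:
            n[event - P] -= 1
        times.append(t)
        states.append(n.copy())
    return np.array(times), np.array(states)

rng = np.random.default_rng(0)
P = 5
w = 0.1 * np.ones((P, P))                          # hypothetical weights
logistic = lambda u: 1.0 / (1.0 + np.exp(-u))      # hypothetical gain function
times, states = gillespie_network(w, logistic, tau=1.0,
                                  n0=np.zeros(P, dtype=int), t_end=10.0, rng=rng)
```

Since the decay rate vanishes for an empty component, the counts never become negative. Other rate functions can be substituted by changing the two lines that compute `up` and `down`.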

It would be an interesting addition to the limit theorems in Theorem 2.1 to derive
a law of large numbers for the models (3.2) with stochastic mean activity

such that the higher order terms are uniformly bounded and vanish in the limit
*f* with

#### 3.2 Bounded State Space Master Equations

We have already stated when introducing the microscopic model in Sect. 1.2 that the
interpretation of the parameter
*k*th population is not literally correct. The state space of the process is unbounded,
hence arbitrarily many neurons can be active, and thus each population contains arbitrarily
many neurons. In order to overcome this interpretation problem, it was proposed to
consider the master equation only on a bounded state space. That is, the *k*th population consists of

A first master equation of this form was considered in [22], which in present notation, takes the form

Versions of such a master equation for, e.g., one population only or coupled inhibitory and excitatory populations were considered in [3,22], and a van Kampen system size expansion was carried out. Here, the bound in the state space provides a natural parameter for the re-scaling, and thus a small parameter for the expansion. The setup of this problem closely resembles the structure of excitable membranes, for which limits have been obtained with the present technique by one of the present authors and co-workers in [27]. Therefore, we conjecture that our limit theorems also apply to this setting with minor adaptations and with essentially the same conditions and results as in Sect. 2. However, the macroscopic limit that will be obtained does not conform with the Wilson–Cowan equation but will be given by

Next, we return to the master equation (1.6) as discussed in this article in Sect. 1.2
and the comment we made regarding bounded state spaces in the footnote on page 7. In
our primary reference for this model [5], a bounded state space version of the master equation was actually considered where
the activation rate for the event

replacing

#### 3.3 Activity Based Neural Field Model

Finally, we also return to a distinction in neural field theory mentioned at the beginning.
In contrast to rate-based neural field models of the Wilson–Cowan type (1.1), there
exists a second essential class of neural field models, so-called activity based models,
the prototype of which is the *Amari equation*

We conjecture that for this type of equation, too, a phenomenological microscopic model can be constructed with a suitable adaptation of the activation rates and that limit theorems analogous to the results in Sect. 2.1 hold. A Langevin equation for this model can then also be obtained and used for further analysis.

### 4 Proofs of the Main Results

In this section, we present the proofs of the limit theorems. For the convenience
of the reader, as it is an important tool in the subsequent proofs, we first state the
Poincaré inequality. Let

where
*ϕ* on the domain *D*, i.e.,

Moreover, the constant in the right-hand side of (4.1) is the optimal constant depending
only on the diameter of the domain *D*, cf. [1,23]. Whenever we omit the spatial domain in the definition of norms or inner products
in
*D*. If the norm is taken only over a subset
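For a bounded convex domain, the Poincaré inequality with the optimal, diameter-dependent constant is usually stated as ‖ϕ − ϕ̄‖ ≤ (diam(*D*)/π) ‖∇ϕ‖ in the L² norm, where ϕ̄ denotes the mean of ϕ over *D*. Assuming this is the form meant by (4.1), the following sketch checks it numerically on an interval *D* = (0, ℓ) for a few test functions with known derivatives. The helper names and the test functions are illustrative choices.

```python
import numpy as np

def trapezoid(values, x):
    """Composite trapezoidal rule on a uniform grid."""
    h = x[1] - x[0]
    return h * (values.sum() - 0.5 * (values[0] + values[-1]))

def poincare_sides(phi, dphi, length, num=20001):
    """Return both sides of ||phi - mean(phi)|| <= (length / pi) * ||phi'|| on (0, length)."""
    x = np.linspace(0.0, length, num)
    v = phi(x)
    mean = trapezoid(v, x) / length
    lhs = np.sqrt(trapezoid((v - mean) ** 2, x))
    rhs = (length / np.pi) * np.sqrt(trapezoid(dphi(x) ** 2, x))
    return lhs, rhs

length = 2.0
cases = {
    "linear": (lambda x: x, lambda x: np.ones_like(x)),
    "quadratic": (lambda x: x ** 2, lambda x: 2.0 * x),
    # First non-constant Neumann eigenfunction: attains equality in one dimension.
    "cosine": (lambda x: np.cos(np.pi * x / length),
               lambda x: -(np.pi / length) * np.sin(np.pi * x / length)),
}
results = {name: poincare_sides(phi, dphi, length) for name, (phi, dphi) in cases.items()}
```

For the cosine both sides coincide up to quadrature error, which reflects that the constant cannot be improved in one dimension.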

For the benefit of the reader, we next repeat the limiting equation

We denote by *F* the Nemytzkii operator on

and for all

Note that

Finally, another useful property is that the means of the process’ components are
bounded. For each *k*, *n* it holds that

see also (B.1). Therefore, it holds that

i.e.,

Here, we also used the assumption

#### 4.1 Proof of Theorem 2.1 (Law of Large Numbers)

In order to prove the law of large numbers, Theorem 2.1, we apply the law of large
numbers for Hilbert space valued PDMPs, see Theorem 4.1 in [27], to the sequence of homogeneous PDMPs

(LLN1) For fixed

(LLN2) The Nemytzkii operator *F* satisfies a Lipschitz condition in
*t*,

(LLN3) For fixed

Note that the final condition of Theorem 4.1 in [27], i.e., the convergence of the initial conditions, is satisfied by assumption. For a discussion of these conditions, we refer to [27] and proceed to their derivation for the present model in the subsequent parts (a) to (c).

(a) In order to prove condition (4.7), we write the integral with respect to the discrete
probability measure

where we have used the upper bound (4.6) on the expectation

(b) The Lipschitz condition (4.8) of the Nemytzkii operators is a straightforward
consequence of the Lipschitz continuity (1.4) of the gain function *f* as

Therefore, (4.8) holds with Lipschitz constant
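The mechanism behind such a Lipschitz estimate can be checked on a grid. The sketch below assumes an L² setting and a Nemytzkii operator of the form *F*(*g*)(*x*) = *f*(∫ *w*(*x*, *y*) *g*(*y*) d*y*), with any external input omitted since it does not affect the bound. Combining the Lipschitz continuity of *f* with the Cauchy–Schwarz inequality then gives the bound ‖*F*(*g*₁) − *F*(*g*₂)‖ ≤ *L* ‖*w*‖ ‖*g*₁ − *g*₂‖, where ‖*w*‖ is the L² norm of the kernel on *D* × *D*. The kernel, the logistic gain function with Lipschitz constant 1/4, and the helper names are assumptions for illustration. The constant obtained here need not coincide with the one in the article.

```python
import numpy as np

def nemytskii(g, w, h, f):
    """Grid version of F(g)(x_i) = f( sum_j w(x_i, y_j) * g(y_j) * h )."""
    return f(h * (w @ g))

def discrete_l2(v, h):
    """Discrete L^2 norm on a uniform grid with mesh size h."""
    return np.sqrt(h * np.sum(v ** 2))

rng = np.random.default_rng(1)
m = 200
h = 1.0 / m
x = (np.arange(m) + 0.5) * h
w = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.1)   # hypothetical weight kernel
logistic = lambda u: 1.0 / (1.0 + np.exp(-u))        # |f'| <= 1/4 everywhere
lip_f = 0.25
w_norm = np.sqrt(h * h * np.sum(w ** 2))             # discrete L^2(D x D) norm of the kernel

pairs = []
for _ in range(100):
    g1, g2 = rng.normal(size=m), rng.normal(size=m)
    lhs = discrete_l2(nemytskii(g1, w, h, logistic) - nemytskii(g2, w, h, logistic), h)
    rhs = lip_f * w_norm * discrete_l2(g1 - g2, h)
    pairs.append((lhs, rhs))
```

On the grid the inequality holds exactly, up to rounding, because the discrete Cauchy–Schwarz inequality mirrors the continuous argument step by step.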

(c) Finally, we prove the convergence of the generators (4.9). To this end, we employ
the characterisation of the norm in

Next, we apply the Nemytzkii operator *F* defined in (4.4) to
*ϕ* to obtain on the other hand

Subtracting (4.12) from (4.11), we obtain the integrated difference

We proceed to estimate the norm of the term in the right-hand side. We use the Lipschitz
condition (1.4) on *f*, the triangle inequality, and finally the Cauchy–Schwarz inequality on the resulting
second term to obtain the estimate

Here, the term in the right-hand side marked

We now consider the term marked

We next apply the Cauchy–Schwarz inequality to the integral inside the square brackets in the last term. Thus, we obtain the estimate

Now the Poincaré inequality (4.1) is applied to the innermost integral inside the square brackets, which yields

Finally, using once more the Cauchy–Schwarz inequality on the innermost summation we obtain

Now, a combination of the estimates (4.13) and (4.14) on the terms

Here, the right-hand side is independent of *ϕ*, hence taking the supremum over all *ϕ* with

Finally, integrating over

Here, we have used (4.6) and a combination of the Cauchy–Schwarz and Poincaré inequality (4.1) in order to estimate

The upper bound in (4.15) is of order