**A Basis for Statistical Theory and Quantum Theory**

Inge S. Helland

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/53702

### **1. Introduction**

Compared to other physical theories, the foundation of quantum mechanics is very formal and abstract. The pure state of a system is defined as a complex vector (or ray) in some abstract vector space, the observables as Hermitian operators on this space. Even a modern textbook like Ballentine [1] starts by introducing two abstract postulates:


$$
\langle \mathbf{R} \rangle = \frac{\text{Tr}(\rho R)}{\text{Tr}(\rho)}.\tag{1}
$$

Here Tr is the trace operator. The discussion in [1] goes on by arguing that *R* must be Hermitian (have real eigenvalues) and that *ρ* ought to be positive with trace 1. An important special case is when *ρ* is one-dimensional: *ρ* = |*ψ*⟩⟨*ψ*| for a vector |*ψ*⟩. Then the state is pure, and is equivalently specified by the vector |*ψ*⟩. In general the formula (1) is a consequence of Born's formula: The probability of observing a pure state |*φ*⟩ when the system is prepared in a pure state |*ψ*⟩ is given by |⟨*φ*|*ψ*⟩|<sup>2</sup>.
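As a small numerical illustration of how formula (1) and Born's formula fit together, the following sketch may help. It assumes a two-dimensional (spin 1/2) system, takes the Pauli *x* matrix as the observable *R*, and uses a hypothetical angle 0.7 to build the pure state; none of these choices come from the text, and the helper names are illustrative only.

```python
import math

def matmul(a, b):
    # Plain matrix product for small nested-list matrices.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def projector(psi):
    # rho = |psi><psi| for a pure state.
    return [[x * y.conjugate() for y in psi] for x in psi]

def inner(phi, psi):
    # <phi|psi>
    return sum(x.conjugate() * y for x, y in zip(phi, psi))

def expectation(rho, R):
    # Formula (1): <R> = Tr(rho R) / Tr(rho).
    return (trace(matmul(rho, R)) / trace(rho)).real

def born_probability(phi, psi):
    # Born's formula: |<phi|psi>|^2.
    return abs(inner(phi, psi)) ** 2

angle = 0.7  # hypothetical preparation angle (assumption)
psi = [complex(math.cos(angle / 2)), complex(math.sin(angle / 2))]
R = [[0j, 1 + 0j], [1 + 0j, 0j]]  # Pauli x, chosen for illustration
s = 1 / math.sqrt(2)
# Eigenvalues and normalized eigenvectors of the Pauli x matrix.
eigen = [(+1.0, [complex(s), complex(s)]), (-1.0, [complex(s), complex(-s)])]

via_trace = expectation(projector(psi), R)
via_born = sum(u * born_probability(phi, psi) for u, phi in eigen)
print(via_trace, via_born)
```

Both routes give the same number here: the trace formula, and the eigenvalues weighted by the Born probabilities of the corresponding eigenstates.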

From these two postulates a very rich theory is deduced, a theory which has proved to be in agreement with observations in each case where it has been tested. Still, the abstract nature of the basic postulates leaves one a little uneasy: Is it possible to find another basis which is more directly connected to what one observes in nature? The purpose of this chapter is to show that to a large extent one can give a positive answer to this question.

© 2013 Helland; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Another problem is that there are many interpretations of quantum mechanics. In this chapter I will choose an *epistemic* interpretation: Quantum mechanics is related to the knowledge we get about nature, not directly to how nature 'is'. The latter aspect - the ontological aspect of nature - is something we can talk about when all observers agree on the same information. Any knowledge about nature is found through an epistemic process - an experiment or an observational study. Typically we ask a question: What is *θ*? And after the epistemic process is completed, nature gives an answer, in the simplest case: *θ* = *uk*, where *uk* is one of several possible values. Here *θ* is what we will call an *epistemic conceptual variable* or *e-variable*, a variable defined by an observer or by a group of observers and defining the epistemic process.


In all empirical sciences, epistemic questions like this are posed to nature. It is well known that the answers are not always that simple. Typically we end up with a confidence interval (a frequentist concept) or a credibility interval (a Bayesian concept) for *θ*. This leads us into statistical science. In statistics, *θ* is most often called a parameter, and is often connected to a population of experimental units. But there are instances also in statistics where we want to predict a value for a single unit. The corresponding intervals are then called prediction intervals. In this chapter we will also use *θ* for an unknown variable for a single unit, which is a situation very often met in physics. This is the generalization we have in mind when we call *θ* an e-variable rather than a parameter. Also, the notion of a parameter may have a different meaning in physics, so this terminology avoids confusion.

A more detailed discussion than can be covered here can be found in Helland [2].

### **2. A basis for statistics**

Every experiment or observational study is made in a context. Part of the context may be physical, another part may be historical, including earlier experiments. Also, the status of the observer(s) may be seen as a part of the context, and another part of the context may be conceptual, including a goal for the study. In all our discussion, we assume that we have conditioned upon the context *τ*. We can imagine the context formulated as a set of propositions. But propositional calculus corresponds to set theory, as both are Boolean algebras. Therefore we can here in principle use the familiar concept of conditioning as developed in Kolmogorov's theory of probability, where it is defined as a Radon-Nikodym derivative. Readers unfamiliar with this mathematics may think of a more intuitive conditioning concept.

In addition, for every experiment, we have an e-variable of interest *θ* and we have data *z*. A basis for all statistical theory is the statistical model, the distribution of *z* as a function of *θ*. Conceptual variables which are not of interest, may be taken as part of the context *τ*. The density of the statistical model, seen as a function of *θ*, is called the likelihood. We will assume throughout:

1) The distribution of *z*, given *τ*, depends on an unknown e-variable *θ*.

2) If *τ* or part of *τ* has a distribution, this is independent of *θ*. The part of *τ* which does not have a distribution is functionally independent of *θ*.

A function of the data is called a statistic *t*(*z*). Often it is of interest to reduce the data to a sufficient statistic, a concept due to R. A. Fisher.

*Definition 1.*


*We say that t* = *t*(*z*) *is τ-sufficient for θ if the conditional distribution of z, given t, τ and θ, is independent of θ.*

The intuitive notion here is that if the distribution of *z*, given *t* is independent of *θ*, the distribution of the whole data set might as well be generated by the distribution of *t*, given *θ* together with some random mechanism which is totally independent of the conceptual variable of interest. This is the basis for

*The sufficiency principle.*

*Consider an experiment in a context τ, let z be the data of this experiment, and let θ be the e-variable of interest. Let t* = *t*(*z*) *be a τ-sufficient statistic for θ. Then, if t*(*z*<sub>1</sub>) = *t*(*z*<sub>2</sub>)*, the data z*<sub>1</sub> *and z*<sub>2</sub> *contain the same experimental evidence about θ in the context τ.*

Here 'experimental evidence' is left undefined. The principle is regarded as intuitively obvious by most statisticians.
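To make Definition 1 concrete, here is a minimal sketch under an assumed model that is not taken from the text: four independent Bernoulli trials with success probability *θ*, where the context *τ* fixes the sample size. The statistic *t*(*z*) is the number of successes, and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb

def model(z, theta):
    # Probability of the 0/1 sequence z under i.i.d. Bernoulli(theta) trials.
    p = Fraction(1)
    for zi in z:
        p *= theta if zi == 1 else 1 - theta
    return p

def t(z):
    # Candidate sufficient statistic: the number of successes.
    return sum(z)

def conditional_given_t(z, theta):
    # P(z | t(z), theta), computed by brute force over all sequences.
    n = len(z)
    p_t = sum(model(w, theta)
              for w in product((0, 1), repeat=n) if t(w) == t(z))
    return model(z, theta) / p_t

z1, z2 = (1, 1, 0, 0), (0, 1, 0, 1)  # two data sets with t(z1) == t(z2)
thetas = [Fraction(1, 10), Fraction(1, 2), Fraction(4, 5)]
cond = [conditional_given_t(z1, th) for th in thetas]
print(cond)
```

The conditional probability is the same for every *θ* tried, which is what τ-sufficiency requires in this toy model. The two data sets also have identical likelihoods, in line with the sufficiency principle.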

Another principle which is considered intuitively obvious by most statisticians is

*The conditionality principle 1.*

*Suppose that there are two experiments E*<sub>1</sub> *and E*<sub>2</sub> *with common conceptual variable of interest θ and with equivalent contexts τ. Consider a mixed experiment E*<sup>∗</sup>*, whereby u* = 1 *or u* = 2 *is observed, each having probability* 1/2 *(independent of θ, the data of the experiments and the contexts), and the experiment E*<sub>*u*</sub> *is then performed. Then the evidence about θ from E*<sup>∗</sup> *is just the evidence from the experiment actually performed.*

Two contexts *τ* and *τ*′ are defined to be equivalent if there is a one-to-one correspondence between them: *τ*′ = *f*(*τ*); *τ* = *f*<sup>−1</sup>(*τ*′). The principle can be motivated by simple examples. From these examples one can also deduce

*The conditionality principle 2.*

*In the situation of conditionality principle 1 one should in any statistical analysis condition upon the outcome of the coin toss.*

It caused much discussion among statisticians when Birnbaum [3] proved that the sufficiency principle and the conditionality principle 1 together imply

*The likelihood principle.*

*Consider two experiments with equivalent contexts τ, and assume that θ is the same full e-variable in both experiments. Suppose that the two observations z*<sub>1</sub><sup>∗</sup> *and z*<sub>2</sub><sup>∗</sup> *have proportional likelihoods in the two experiments. Then these two observations produce the same evidence on θ in this context.*

It is crucial for the present chapter that these principles may be generalized from experiments to any epistemic processes involving data such that 1) and 2) are satisfied.

An important special case of the likelihood principle is when *E*<sub>1</sub> and *E*<sub>2</sub> are the same experiment and *z*<sub>1</sub><sup>∗</sup> and *z*<sub>2</sub><sup>∗</sup> have equal likelihoods. Then the likelihood principle says that any experimental evidence on *θ* must depend only on the likelihood (given the context). Without taking the context into account this is really controversial. It seems like common statistical methods like confidence intervals and tests of hypotheses are excluded. But this is saved when we can take confidence levels, alternative hypotheses, test levels etc. as part of the context.
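A standard textbook illustration of proportional likelihoods, not taken from this chapter, compares binomial sampling (a fixed number of trials) with negative binomial sampling (continuing until a fixed number of successes). The sketch below assumes 12 trials and 3 successes, both hypothetical numbers. Whether the two stopping rules count as equivalent contexts in the sense used above is an assumption that has to be made separately.

```python
from fractions import Fraction
from math import comb

def binomial_likelihood(theta, n=12, successes=3):
    # n trials fixed in advance; the number of successes is the data.
    return comb(n, successes) * theta**successes * (1 - theta)**(n - successes)

def negative_binomial_likelihood(theta, successes=3, n=12):
    # Sampling continues until the 3rd success, which occurs on trial n.
    return (comb(n - 1, successes - 1)
            * theta**successes * (1 - theta)**(n - successes))

thetas = [Fraction(k, 10) for k in range(1, 10)]
ratios = [binomial_likelihood(th) / negative_binomial_likelihood(th)
          for th in thetas]
print(ratios)
```

The ratio of the two likelihoods does not depend on *θ*, so the two observations have proportional likelihoods. Frequentist tests for the two designs can nevertheless give different p-values, which is one reason the principle is controversial when the context is ignored.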


A discussion of these common statistical methods will not be included here; the reader is referred to [2] for this. Also, a discussion of the important topic of model reduction in statistics will be omitted here. Sometimes a statistical model contains more structure than what has been assumed here; for instance group actions may be defined on the space of e-variables. Then any model reduction should be to an orbit or to a set of orbits for the group; for examples, see [2].

### **3. Inaccessible conceptual variables and quantum theory**

An e-variable as it is used here is related to the question posed in an epistemic process: What is the value of *θ*? Sometimes we can obtain an accurate answer to such a question, sometimes not. We call *θ* accessible if we in principle can devise an experiment such that *θ* can be assessed with arbitrary accuracy. If this in principle is impossible, we say that *θ* is inaccessible.

Consider a single medical patient who at time *t* = 0 can be given one out of two mutually exclusive treatments A or B. The time *θ*<sup>*A*</sup> until recovery given treatment A can be measured accurately by giving this treatment and waiting a sufficiently long time, likewise the time *θ*<sup>*B*</sup> until recovery given treatment B. But consider the vector *φ* = (*θ*<sup>*A*</sup>, *θ*<sup>*B*</sup>). This vector cannot be assessed with arbitrary accuracy by any person, whether before, during or after treatment. The vector *φ* is inaccessible. A similar phenomenon occurs in all counterfactual situations.

Many more situations with inaccessible conceptual variables can be devised. Consider a fragile apparatus which is destroyed after a single measurement of some quantity *θ*1, and let *θ*<sup>2</sup> be another quantity which can only be measured by dismantling the apparatus. Then *φ* = (*θ*1, *θ*2) is inaccessible. Or consider two sensitive questions to be posed to a single person at some moment of time, where we expect that the order in which the questions are posed may be relevant for the answers. Let (*θ*1, *θ*2) be the answers when the questions are posed in one order, and let (*θ*3, *θ*4) be the answers when the questions are posed in the opposite order. Then the vector *φ* = (*θ*1, *θ*2, *θ*3, *θ*4) is inaccessible.

I will approach quantum mechanics by looking upon it as an epistemic science and pointing out the different inaccessible conceptual variables. First, by Heisenberg's uncertainty principle, the vector (*ξ*, *π*) is inaccessible, where *ξ* is the theoretical position and *π* is the theoretical momentum of a particle. This implies that (*ξ*(*t*1), *ξ*(*t*2)), the positions at two different times, is an inaccessible vector. Hence the trajectory of the particle is inaccessible. In the two-slit experiment (*α*, *θ*) is inaccessible, where *α* denotes the slit that the particle goes through, and *θ* is the phase of the particle's wave as it hits the screen.

In this chapter I will pay particular attention to a particle's spin/angular momentum. The spin or angular momentum vector is inaccessible, but its component *λ*<sup>*a*</sup> in any chosen direction *a* will be accessible.

It will be crucial for my discussion that even though a vector is inaccessible, it can be regarded as an abstract quantity taking values in some space, and one can often act on it by group actions. Thus in the medical example which started this section, a change of time units will affect the whole vector *φ*, and a spin vector can be acted upon by rotations.

### **4. The maximal symmetrical epistemic setting**


A general setting will be described, and then I will show that spin and angular momentum are special cases of this setting. This is called the maximal symmetrical epistemic setting.

Consider an inaccessible conceptual variable *φ*, and let there be accessible e-variables *λ*<sup>*a*</sup>(*φ*) (*a* ∈ A) indexed by some set A. Thus for each *a*, one can ask the question: What is the value of *λ*<sup>*a*</sup>? and get some information from experiment. To begin with, assume that these are maximally accessible, more precisely maximal in the ordering where *α* < *β* when *α* = *f*(*β*) for some *f*. This can be assumed by Zorn's lemma, but it will later be relaxed. For *a* ≠ *b* let there be an invertible transformation *g*<sub>*ab*</sub> such that *λ*<sup>*b*</sup>(*φ*) = *λ*<sup>*a*</sup>(*g*<sub>*ab*</sub>(*φ*)).

In general, let a group *H* act on a conceptual variable *φ*. A function *η*(*φ*) is said to be permissible with respect to *H* if *η*(*φ*<sub>1</sub>) = *η*(*φ*<sub>2</sub>) implies *η*(*hφ*<sub>1</sub>) = *η*(*hφ*<sub>2</sub>) for all *h* ∈ *H*. Then one can define a corresponding group *H̃* acting upon *η*. For a given function *η*(*φ*) there is a maximal group with respect to which it is permissible.
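The definition of permissibility can be checked by brute force on a finite toy example. The sketch below is an assumption-laden illustration, not part of the chapter: the space of *φ*'s is taken to be the nine pairs with entries in {0, 1, 2}, and two small hypothetical groups act on it, simultaneous cyclic shifts of both coordinates and the swap of the two coordinates.

```python
from itertools import product

SPACE = list(product(range(3), repeat=2))  # toy space of phi = (x, y)

def shift(k):
    # Cyclic shift of both coordinates by k (mod 3).
    return lambda phi: ((phi[0] + k) % 3, (phi[1] + k) % 3)

def identity(phi):
    return phi

def swap(phi):
    return (phi[1], phi[0])

SHIFTS = [shift(k) for k in range(3)]  # cyclic group of order 3
SWAPS = [identity, swap]               # group of order 2

def is_permissible(eta, group, space=SPACE):
    # eta is permissible if eta(phi1) == eta(phi2) implies
    # eta(h phi1) == eta(h phi2) for every h in the group.
    for phi1, phi2 in product(space, repeat=2):
        if eta(phi1) == eta(phi2):
            if any(eta(h(phi1)) != eta(h(phi2)) for h in group):
                return False
    return True

def first(phi):
    return phi[0]

def total(phi):
    return phi[0] + phi[1]

print(is_permissible(first, SHIFTS), is_permissible(first, SWAPS))
print(is_permissible(total, SWAPS), is_permissible(total, SHIFTS))
```

In this toy setting the first coordinate is permissible under the shifts but not under the swap, while the coordinate sum behaves the other way round. When a function is permissible, the group action on *φ* induces a well-defined action on the values of the function, which is the role played by *H̃* in the text.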

Now fix 0 ∈ A and let *G*<sup>0</sup> be the maximal group under which *λ*<sup>0</sup>(*φ*) is permissible. Take *G*<sup>*a*</sup> = *g*<sub>*a*0</sub>*G*<sup>0</sup>*g*<sub>0*a*</sub>, and let *G* be the smallest group containing *G*<sup>0</sup> and all the transformations *g*<sub>*a*0</sub>. It is then easy to see that *G*<sup>*a*</sup> is the maximal group under which *λ*<sup>*a*</sup>(*φ*) is permissible, and that *G* is the group generated by *G*<sup>*a*</sup>; *a* ∈ A and the transformations *g*<sub>*ab*</sub>. Make the following assumptions about *G*:

a) It is a locally compact topological group satisfying weak conditions such that an invariant measure *ρ* exists on the space Φ of *φ*'s.

b) *λ*<sup>*a*</sup>(*φ*) varies over an orbit or a set of orbits of the smaller group *G*<sup>*a*</sup>. More precisely: *λ*<sup>*a*</sup> varies over an orbit or a set of orbits of the corresponding group *G̃*<sup>*a*</sup> on its range.

c) *G* is generated by the product of elements of *G*<sup>*a*</sup>, *G*<sup>*b*</sup>, ...; *a*, *b*, ... ∈ A.

As an important example, let *φ* be the spin vector or the angular momentum vector for a particle or a system of particles. Let *G* be the group of rotations of the vector *φ*, that is, the group which fixes the norm ‖*φ*‖. Next, choose a direction *a* in space, and focus upon the spin component in this direction: *ζ*<sup>*a*</sup> = ‖*φ*‖cos(*φ*, *a*). The largest subgroup *G*<sup>*a*</sup> with respect to which *ζ*<sup>*a*</sup>(*φ*) is permissible is given by rotations around *a* together with a reflection in a plane perpendicular to *a*. However, the action of the corresponding group *G̃*<sup>*a*</sup> on *ζ*<sup>*a*</sup> is just a reflection together with the identity.
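The action described here can be checked numerically. In the sketch below the direction *a*, the vector *φ* and the rotation angle are arbitrary illustrative values (assumptions), and the component is computed as the dot product of *φ* with the unit vector *a*, which equals ‖*φ*‖cos(*φ*, *a*).

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def rotate_about(a, angle, v):
    # Rodrigues' rotation formula for a unit axis a.
    c, s = math.cos(angle), math.sin(angle)
    k = dot(a, v) * (1 - c)
    axv = cross(a, v)
    return tuple(v[i] * c + axv[i] * s + a[i] * k for i in range(3))

def reflect_perpendicular(a, v):
    # Reflection in the plane through the origin perpendicular to unit a.
    d = dot(a, v)
    return tuple(v[i] - 2 * d * a[i] for i in range(3))

def zeta(a, phi):
    # Component of phi along the unit vector a.
    return dot(a, phi)

a = (0.0, 0.0, 1.0)      # chosen direction (illustrative)
phi = (0.3, -1.2, 0.8)   # hypothetical spin vector (illustrative)
rotated = rotate_about(a, 1.1, phi)
reflected = reflect_perpendicular(a, phi)
print(zeta(a, phi), zeta(a, rotated), zeta(a, reflected))
```

Rotations around *a* leave the component unchanged and the reflection flips its sign, so on the values of the component the induced group consists only of the identity and a sign change.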

Finally introduce model reduction. As mentioned at the end of Section 2, such a model reduction for *ζ*<sup>*a*</sup> should be to an orbit or to a set of orbits for the group *G̃*<sup>*a*</sup> as acting on *ζ*<sup>*a*</sup>. These orbits are given as two-point sets ±*c* together with the single point 0. To conform to the ordinary theory of spin/angular momentum, I will choose the set of orbits indexed by an integer or half-integer *j* and let the reduced set of orbits be −*j*, −*j* + 1, ..., *j* − 1, *j*. Letting *λ*<sup>*a*</sup> be the e-variable *ζ*<sup>*a*</sup> reduced to this set of orbits of *G̃*<sup>*a*</sup>, and assuming it to be a maximally accessible e-variable, we can prove the general assumptions of the maximal symmetrical epistemic setting (except for the case *j* = 0, where we must redefine *G* to be the trivial group). For instance, here is an indication of the proof leading to assumption c) above: given *a* and *b*, a transformation *g*<sub>*ab*</sub> sending *λ*<sup>*a*</sup>(*φ*) onto *λ*<sup>*b*</sup>(*φ*) can be obtained by a reflection in the plane which is orthogonal to the plane spanned by the two vectors *a* and *b* and which contains the midline between *a* and *b*.

The case with one orbit and *c* = *j* = 1/2 corresponds to electrons and other spin 1/2 particles.
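The reduced value sets can be made concrete with a small sketch. It is not part of the original argument; the function names `spin_values` and `reflection_orbits` are hypothetical, and the only group action modelled is the reflection *ζ* → −*ζ* together with the identity.

```python
from fractions import Fraction

def spin_values(j):
    """Return the reduced value set -j, -j+1, ..., j for integer or half-integer j."""
    j = Fraction(j)
    if j < 0 or (2 * j).denominator != 1:
        raise ValueError("j must be a non-negative integer or half-integer")
    return [-j + k for k in range(int(2 * j) + 1)]

def reflection_orbits(values):
    """Group values into orbits of the two-element group {identity, reflection}."""
    orbits = {frozenset({v, -v}) for v in values}
    return sorted(orbits, key=lambda orbit: max(orbit))

# j = 1/2: a single two-point orbit {-1/2, +1/2}
half = spin_values(Fraction(1, 2))
# j = 1: the single point 0 together with the two-point orbit {-1, +1}
one = spin_values(1)
```

For *j* = 1/2 the sketch yields one two-point orbit, which is the case referred to above; for integer *j* the single point 0 appears as an orbit of its own.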

In general, assumption b) in the maximal symmetrical epistemic setting may be motivated in a similar manner: First, a conceptual variable *ζ<sup>a</sup>* is introduced for each *a* through a chosen focusing, then define *G<sup>a</sup>* as the maximal group under which *ζ<sup>a</sup>*(*φ*) is permissible, with *G̃<sup>a</sup>* being the corresponding group acting on *ζ<sup>a</sup>*. Finally define *λ<sup>a</sup>* as the reduction of *ζ<sup>a</sup>* to a set of orbits of *G̃<sup>a</sup>*. The content of assumption b) is that it is *this λ<sup>a</sup>* which is maximally accessible. This may be regarded as the quantum hypothesis.

#### **5. Hilbert space, pure states and operators**

Consider the maximal symmetrical epistemic setting. The crucial step towards the formalism of quantum mechanics is to define a Hilbert space, that is, a complete inner product space which can serve as a state space for the system.

By assumption a) there exists an invariant measure *ρ* for the group's action: *ρ*(*gA*) = *ρ*(*A*) for all *g* ∈ *G* and all Borel-measurable subsets *A* of the space Φ of inaccessible conceptual variables. If *G* is transitive on Φ, then *ρ* is unique up to a multiplicative constant. For compact groups *ρ* can be normalized, i.e., taken as a probability measure. For each *a* define

$$H^a = \{ f \in L^2(\Phi, \rho) : \ f(\phi) = r(\lambda^a(\phi)) \text{ for some function } r \}.$$

Thus *H<sup>a</sup>* is the set of *L*<sup>2</sup>-functions that are functions of *λ<sup>a</sup>*(*φ*). Since *H<sup>a</sup>* is a closed subspace of the Hilbert space *L*<sup>2</sup>(Φ, *ρ*), it is itself a Hilbert space. To define our state space *H*, we now fix an arbitrary index *a* = 0 ∈ A, and take

$$H = H^0.$$

First look at the case where the accessible e-variables take a finite, discrete set of values. Let {*u<sub>k</sub>*} be the set of possible values of *λ<sup>a</sup>*. Since *λ<sup>a</sup>*(·) is maximal, {*u<sub>k</sub>*} can be taken to be independent of *a*, see [2]. Now go back to the definition of an epistemic process: We start by choosing *a*, that is, ask an epistemic question: What is the value of *λ<sup>a</sup>*? After the process we get some information; I will here look upon the simple case where we get full knowledge: *λ<sup>a</sup>* = *u<sub>k</sub>*. I define this as a pure state of the system; it can be characterized by the indicator function **1**(*λ<sup>a</sup>*(*φ*) = *u<sub>k</sub>*). This is a function in *H<sup>a</sup>*, but I will show below that one can find an invertible operator *V<sup>a</sup>* such that

$$f\_k^a(\phi) = V^a \mathbf{1}(\lambda^a(\phi) = u\_k) \tag{2}$$

is a unique function in *H* = *H*<sup>0</sup>. Since *H* in this case is a *K*-dimensional vector space, where *K* is the number of values *u<sub>k</sub>*, we can regard *f<sub>k</sub><sup>a</sup>* as a *K*-dimensional vector. To conform to the ordinary quantum mechanical notation, I write this as a ket-vector |*a*; *k*⟩ = *f<sub>k</sub><sup>a</sup>*. It is easy to see that {|0; *k*⟩; *k* = 1, ..., *K*} is an orthonormal basis of *H* when *ρ* is normalized to be 1 for the whole space Φ. I will show below that {|*a*; *k*⟩; *k* = 1, ..., *K*} has the same property. My main point is that |*a*; *k*⟩ is characterized by and characterizes a question: What is *λ<sup>a</sup>*? together with an answer: *λ<sup>a</sup>* = *u<sub>k</sub>*. This is a pure state for the maximal symmetrical epistemic setting.

I will also introduce operators by

$$A^a = \sum\_{k=1}^K u\_k |a;k\rangle\langle a;k|,$$

where ⟨*a*; *k*| is the bra vector corresponding to |*a*; *k*⟩. This is by definition the observator corresponding to the e-variable *λ<sup>a</sup>*. Since *λ<sup>a</sup>* is maximal, *A<sup>a</sup>* will have non-degenerate eigenvalues *u<sub>k</sub>*. Knowing *A<sup>a</sup>*, we will have information about all possible values of *λ<sup>a</sup>* together with information about all possible states connected to this variable.

The rest of this section will be devoted to proving (2) and showing the properties of the state vectors |*a*; *k*⟩. To allow for future generalizations I now allow the accessible e-variables *λ<sup>a</sup>* to take any set of values, continuous or discrete. The discussion will by necessity be a bit technical. First I define the (left) regular representation *U* for a group *G*. For given *f* ∈ *L*<sup>2</sup>(Φ, *ρ*) and given *g* ∈ *G* we define a new function *U*(*g*)*f* by

$$
U(g)f(\phi) = f(g^{-1}\phi).\tag{3}
$$

Without proof I mention 5 properties of the set of operators *U*(*g*):

• *U*(*g*) is linear: *U*(*g*)(*a*<sub>1</sub>*f*<sub>1</sub> + *a*<sub>2</sub>*f*<sub>2</sub>) = *a*<sub>1</sub>*U*(*g*)*f*<sub>1</sub> + *a*<sub>2</sub>*U*(*g*)*f*<sub>2</sub>.

• *U*(*g*) is unitary: ⟨*U*(*g*)*f*<sub>1</sub>, *f*<sub>2</sub>⟩ = ⟨*f*<sub>1</sub>, *U*(*g*)<sup>−1</sup>*f*<sub>2</sub>⟩ in *L*<sup>2</sup>(Φ, *ρ*).

• *U*(*g*) is bounded: sup<sub>*f*: ‖*f*‖=1</sub> ‖*U*(*g*)*f*‖ = 1 < ∞.

• *U*(·) is continuous: If lim *g<sub>n</sub>* = *g*<sub>0</sub> in the group topology, then lim *U*(*g<sub>n</sub>*) = *U*(*g*<sub>0</sub>) (in the matrix norm in the finite-dimensional case, which is what I will focus on here; in general in the topology of bounded linear operators).

• *U*(·) is a homomorphism: *U*(*g*<sub>1</sub>*g*<sub>2</sub>) = *U*(*g*<sub>1</sub>)*U*(*g*<sub>2</sub>) for all *g*<sub>1</sub>, *g*<sub>2</sub> and *U*(*e*) = *I* for the unit element.


The concept of *homomorphism* will be crucial in this section. In general, a homomorphism is a mapping *k* → *k*′ between groups *K* and *K*′ such that *k*<sub>1</sub> → *k*′<sub>1</sub> and *k*<sub>2</sub> → *k*′<sub>2</sub> implies *k*<sub>1</sub>*k*<sub>2</sub> → *k*′<sub>1</sub>*k*′<sub>2</sub> and such that *e* → *e*′ for the identities. Then also *k*<sup>−1</sup> → (*k*′)<sup>−1</sup> when *k* → *k*′.

A *representation* of a group *K* is a continuous homomorphism from *K* into a group of invertible operators on some vector space. If the vector space is finite dimensional, the linear operators can be taken as matrices. There is a large and useful mathematical theory about operator (matrix) representations of groups; some of it is sketched in Appendix 3 of [2]. Equation (3) gives one such representation of the basic group *G* on the vector space *L*2(Φ, *ρ*).
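As a hedged illustration of the regular representation (3), consider a toy setting that is an assumption of this sketch and not taken from the text: Φ consists of three points, *G* is the full permutation group on Φ, and *ρ* is the uniform probability measure. Functions on Φ are stored as tuples of values, and the helper names are hypothetical.

```python
from itertools import permutations

PHI = (0, 1, 2)                      # toy space of conceptual variables
G = list(permutations(PHI))          # all permutations; g[i] is the image of point i

def compose(g, h):
    """Group product gh: apply h first, then g."""
    return tuple(g[h[i]] for i in PHI)

def inverse(g):
    """Inverse permutation of g."""
    inv = [0] * len(PHI)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def U(g, f):
    """Regular representation: (U(g)f)(phi) = f(g^{-1} phi)."""
    ginv = inverse(g)
    return tuple(f[ginv[phi]] for phi in PHI)

def inner(f1, f2):
    """Inner product in L2(PHI, rho) for real f1, f2 and uniform rho."""
    return sum(x * y for x, y in zip(f1, f2)) / len(PHI)

f1, f2 = (1, 2, 3), (4, 0, -1)
identity = (0, 1, 2)

# Homomorphism property: U(g h) = U(g) U(h), and U(e) = I
homomorphism_ok = all(U(compose(g, h), f1) == U(g, U(h, f1)) for g in G for h in G)
# Unitarity: <U(g) f1, f2> = <f1, U(g)^{-1} f2>
unitary_ok = all(inner(U(g, f1), f2) == inner(f1, U(inverse(g), f2)) for g in G)
```

In this finite case each *U*(*g*) is simply a permutation matrix acting on the vector of function values, which is why the checks reduce to reindexing sums.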

#### *Proposition 1.*

*Let U<sub>a</sub>* = *U*(*g*<sub>0*a*</sub>) *with g<sub>ab</sub> defined in the beginning of Section 4. Then*

$$H^a = U\_a^{-1} H \text{ through } r(\lambda^a(\phi)) = U\_a^{-1} r(\lambda^0(\phi)).$$

Proof. If *f* ∈ *H<sup>a</sup>*, then *f*(*φ*) = *r*(*λ<sup>a</sup>*(*φ*)) = *r*(*λ*<sup>0</sup>(*g*<sub>0*a*</sub>*φ*)) = *U*(*g*<sub>0*a*</sub>)<sup>−1</sup>*r*(*λ*<sup>0</sup>(*φ*)) = *U<sub>a</sub>*<sup>−1</sup>*f*<sub>0</sub>(*φ*), where *f*<sub>0</sub> ∈ *H* = *H*<sup>0</sup>.


Since *a* = 0 is a fixed but arbitrary index, this gives in principle a unitary connection between the different choices of *H*, different representations of the 'Hilbert space apparatus'. However, this connection cannot be used directly in (2), since if *f<sub>k</sub><sup>a</sup>* = **1**(*λ<sup>a</sup>* = *u<sub>k</sub>*) is the state function representing the question: What is *λ<sup>a</sup>*? together with the answer *λ<sup>a</sup>* = *u<sub>k</sub>*, then we have

$$U\_a f\_k^a = U(g\_{0a})\mathbf{1}(\lambda^0(g\_{0a}\phi) = u\_k) = U(g\_{0a})U(g\_{0a})^{-1}\mathbf{1}(\lambda^0(\phi) = u\_k) = f\_k^0.$$

Thus by this simple transformation the indicator functions in *H* are not able to distinguish between the different questions asked.
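The loss of information can be seen in a small numerical sketch. The setting is an assumption made for illustration only: Φ has four points, *λ*<sup>0</sup> takes the values 0 and 1, and two hypothetical transformations `g_0a` and `g_0b` define two different questions through *λ<sup>a</sup>*(*φ*) = *λ*<sup>0</sup>(*g*<sub>0*a*</sub>*φ*) and *λ<sup>b</sup>*(*φ*) = *λ*<sup>0</sup>(*g*<sub>0*b*</sub>*φ*).

```python
PHI = (0, 1, 2, 3)                        # toy space of conceptual variables

def inverse(g):
    """Inverse of a permutation g, where g[i] is the image of point i."""
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def U(g, f):
    """Regular representation: (U(g)f)(phi) = f(g^{-1} phi)."""
    ginv = inverse(g)
    return tuple(f[ginv[phi]] for phi in PHI)

def indicator(lam, u):
    """Indicator function 1(lam(phi) = u) as a tuple over PHI."""
    return tuple(1 if lam[phi] == u else 0 for phi in PHI)

lam0 = tuple(phi // 2 for phi in PHI)             # lambda^0 = (0, 0, 1, 1)
g_0a = (0, 2, 1, 3)                               # hypothetical: swaps points 1 and 2
g_0b = (0, 3, 2, 1)                               # hypothetical: swaps points 1 and 3
lam_a = tuple(lam0[g_0a[phi]] for phi in PHI)     # lambda^a(phi) = lambda^0(g_0a phi)
lam_b = tuple(lam0[g_0b[phi]] for phi in PHI)     # lambda^b(phi) = lambda^0(g_0b phi)

values = (0, 1)
f0 = [indicator(lam0, u) for u in values]
images_a = [U(g_0a, indicator(lam_a, u)) for u in values]   # U_a f_k^a
images_b = [U(g_0b, indicator(lam_b, u)) for u in values]   # U_b f_k^b
```

Although `lam_a` and `lam_b` are different functions on Φ, both `images_a` and `images_b` coincide with `f0`, so the transformed indicators carry no trace of which question was asked.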

Another reason why the simple solution is not satisfactory is that the regular representation *U* will not typically be a representation of the whole group *G* on the Hilbert space *H*. This can however be remedied by the following theorem. Its proof and the resulting discussion below are where Assumption c) of the maximal symmetrical epistemic setting is used. Recall that throughout, upper indices (*G<sup>a</sup>*, *g<sup>a</sup>*) are for the subgroups of *G* connected to the accessible variables *λ<sup>a</sup>*, and similarly (*G̃<sup>a</sup>*, *g̃<sup>a</sup>*) for the group (elements) acting upon *λ<sup>a</sup>*. Lower indices (e.g., *U<sub>a</sub>* = *U*(*g*<sub>0*a*</sub>)) are related to the transformations between these variables.

#### *Theorem 1.*

*(i) A representation (possibly multivalued) V of the whole group G on H can always be found.*

*(ii) For g<sup>a</sup> ∈ G<sup>a</sup> we have V(g<sup>a</sup>) = U<sub>a</sub>U(g<sup>a</sup>)U<sub>a</sub><sup>†</sup>.*

Proof. (i) For each *a* and for *g<sup>a</sup>* ∈ *G<sup>a</sup>* define *V*(*g<sup>a</sup>*) = *U*(*g*<sub>0*a*</sub>)*U*(*g<sup>a</sup>*)*U*(*g*<sub>*a*0</sub>). Then *V*(*g<sup>a</sup>*) is an operator on *H* = *H*<sup>0</sup>, since it is equal to *U*(*g*<sub>0*a*</sub>*g<sup>a</sup>g*<sub>*a*0</sub>), and *g*<sub>0*a*</sub>*g<sup>a</sup>g*<sub>*a*0</sub> ∈ *G*<sup>0</sup> by the construction of *G<sup>a</sup>* from *G*<sup>0</sup>. For a product *g<sup>a</sup>g<sup>b</sup>g<sup>c</sup>* with *g<sup>a</sup>* ∈ *G<sup>a</sup>*, *g<sup>b</sup>* ∈ *G<sup>b</sup>* and *g<sup>c</sup>* ∈ *G<sup>c</sup>* we define *V*(*g<sup>a</sup>g<sup>b</sup>g<sup>c</sup>*) = *V*(*g<sup>a</sup>*)*V*(*g<sup>b</sup>*)*V*(*g<sup>c</sup>*), and similarly for all elements of *G* that can be written as a finite product of elements from different subgroups.

Let now *g* and *h* be any two elements in *G* such that *g* can be written as a product of elements from *G<sup>a</sup>*, *G<sup>b</sup>* and *G<sup>c</sup>*, and similarly *h* (the proof is similar for other cases). It follows that *V*(*gh*) = *V*(*g*)*V*(*h*) on these elements, since the last factor of *g* and the first factor of *h* either must belong to the same subgroup or to different subgroups; in both cases the product can be reduced by the definition of the previous paragraph. In this way we see that *V* is a representation on the set of finite products, and since these generate *G* by Assumption c), and since *U*, hence by definition *V*, is continuous, it is a representation of *G*.

Since different representations of *g* as a product may give different solutions, we have to include the possibility that *V* may be multivalued.

(ii) Directly from the proof of (i).
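The conjugation step in the proof can be checked on a finite toy model. The following sketch rests on assumptions that are not in the text: Φ has four points, *G* is the full permutation group on Φ, and *G*<sup>0</sup>, *G<sup>a</sup>* are taken to be all permutations under which *λ*<sup>0</sup>, respectively *λ<sup>a</sup>*, is permissible. The names `permissible`, `g_0a` and `V` are hypothetical.

```python
from itertools import permutations

PHI = (0, 1, 2, 3)
G = list(permutations(PHI))               # toy stand-in for the full group G

def compose(g, h):
    """Group product gh: apply h first, then g."""
    return tuple(g[h[i]] for i in PHI)

def inverse(g):
    inv = [0] * len(PHI)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def U(g, f):
    """Regular representation: (U(g)f)(phi) = f(g^{-1} phi)."""
    ginv = inverse(g)
    return tuple(f[ginv[phi]] for phi in PHI)

def permissible(g, lam):
    """True if lam(g phi) depends on phi only through lam(phi)."""
    seen = {}
    for phi in PHI:
        if seen.setdefault(lam[phi], lam[g[phi]]) != lam[g[phi]]:
            return False
    return True

lam0 = (0, 0, 1, 1)
g_0a = (0, 2, 1, 3)                                # swaps points 1 and 2
g_a0 = inverse(g_0a)
lam_a = tuple(lam0[g_0a[phi]] for phi in PHI)      # lambda^a(phi) = lambda^0(g_0a phi)

G0 = [g for g in G if permissible(g, lam0)]
Ga = [g for g in G if permissible(g, lam_a)]

def V(ga, f):
    """V(g^a) = U(g_0a) U(g^a) U(g_a0), applied to a function f on PHI."""
    return U(g_0a, U(ga, U(g_a0, f)))

conjugates = [compose(compose(g_0a, ga), g_a0) for ga in Ga]
f_in_H = (5, 5, 7, 7)                              # a function of lambda^0, so in H
```

Every conjugate *g*<sub>0*a*</sub>*g<sup>a</sup>g*<sub>*a*0</sub> lands in `G0`, and `V(ga, f_in_H)` is again a function of *λ*<sup>0</sup>, which is the sense in which *V*(*g<sup>a</sup>*) is an operator on *H*.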

What is meant by a multivalued representation? As an example, consider the group *SU*(2) of unitary 2 × 2 matrices. Many books in group theory will state that there is a homomorphism from *SU*(2) to the group *SO*(3) of real 3-dimensional rotations, where the kernel of the homomorphism is ±*I*. This latter statement means that both +*I* and −*I* are mapped into the identity rotation by the homomorphism.
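The two-to-one character of this homomorphism can be verified numerically. The sketch below uses the standard formula *R<sub>ij</sub>* = ½ Tr(*σ<sub>i</sub>Uσ<sub>j</sub>U*<sup>†</sup>) with the Pauli matrices *σ<sub>i</sub>*; this formula and the helper names are not taken from the chapter and are included only as an illustration.

```python
import math

# Pauli matrices as nested tuples
SIGMA = [
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def dagger(A):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(tuple(complex(A[j][i]).conjugate() for j in range(2)) for i in range(2))

def negate(A):
    return tuple(tuple(-x for x in row) for row in A)

def su2(axis, angle):
    """U = cos(angle/2) I - i sin(angle/2) (n . sigma) for a unit vector n."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    n_sigma = [[sum(axis[m] * SIGMA[m][i][j] for m in range(3)) for j in range(2)]
               for i in range(2)]
    return tuple(tuple(c * (i == j) - 1j * s * n_sigma[i][j] for j in range(2))
                 for i in range(2))

def rotation(Umat):
    """Map U in SU(2) to the real 3x3 matrix R_ij = (1/2) Tr(sigma_i U sigma_j U^dagger)."""
    Ud = dagger(Umat)
    R = []
    for i in range(3):
        row = []
        for j in range(3):
            M = matmul(matmul(SIGMA[i], Umat), matmul(SIGMA[j], Ud))
            row.append(((M[0][0] + M[1][1]) / 2).real)
        R.append(row)
    return R

U1 = su2((0.0, 0.0, 1.0), 1.0)            # rotation by 1 radian about the z-axis
R_plus = rotation(U1)
R_minus = rotation(negate(U1))            # -U1 gives the same rotation
R_full_turn = rotation(su2((0.0, 0.0, 1.0), 2 * math.pi))   # U = -I up to rounding
```

Both `U1` and `negate(U1)` give the same rotation matrix, and a full turn, for which *U* = −*I*, gives the identity rotation.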

In this case there is no unique inverse *SO*(3) → *SU*(2), but nevertheless we may say informally that there is a multivalued homomorphism from *SO*(3) to *SU*(2). Here is a way to make this precise:

Extend *SU*(2) to a new group with elements (*g*, *k*), where *g* ∈ *SU*(2) and *k* is an element of the group *K* = {±1} with the natural multiplication. The multiplication in this extended group is defined by (*g*<sub>1</sub>, *k*<sub>1</sub>) · (*g*<sub>2</sub>, *k*<sub>2</sub>) = (*g*<sub>1</sub>*g*<sub>2</sub>, *k*<sub>1</sub>*k*<sub>2</sub>), and the inverse by (*g*, *k*)<sup>−1</sup> = (*g*<sup>−1</sup>, *k*<sup>−1</sup>). Then there is an invertible homomorphism between this extended group and *SO*(3).

A similar construction can be made with the representation *V* of Theorem 1.

#### *Theorem 2.*


*(i) There is an extended group G′ such that V is a single-valued representation of G′ on H.*

*(ii) There is a unique mapping G′ → G, denoted by g′ → g, such that V(g′) = V(g). This mapping is a homomorphism.*

Proof. (i) Assume as in Theorem 1 that we have a multivalued representation *V* of *G*. Define a larger group *G*′ as follows: If *g<sup>a</sup>g<sup>b</sup>g<sup>c</sup>* = *g<sup>d</sup>g<sup>e</sup>g<sup>f</sup>*, say, with *g<sup>k</sup>* ∈ *G<sup>k</sup>* for all *k*, we define *g*′<sub>1</sub> = *g<sup>a</sup>g<sup>b</sup>g<sup>c</sup>* and *g*′<sub>2</sub> = *g<sup>d</sup>g<sup>e</sup>g<sup>f</sup>*. A similar definition of new group elements is made if we have equality of a limit of such products. Let *G*′ be the collection of all such new elements that can be written as a formal product of elements *g<sup>k</sup>* ∈ *G<sup>k</sup>* or as limits of such symbols. The product is defined in the natural way, and the inverse by, for example, (*g<sup>a</sup>g<sup>b</sup>g<sup>c</sup>*)<sup>−1</sup> = (*g<sup>c</sup>*)<sup>−1</sup>(*g<sup>b</sup>*)<sup>−1</sup>(*g<sup>a</sup>*)<sup>−1</sup>. By Assumption 2c), the group *G*′ generated by this construction must be at least as large as *G*. It is clear from the proof of Theorem 1 that *V* also is a representation of the larger group *G*′ on *H*, now a single-valued representation.

(ii) Again, if *g<sup>a</sup>g<sup>b</sup>g<sup>c</sup>* = *g<sup>d</sup>g<sup>e</sup>g<sup>f</sup>* = *g*, say, with *g<sup>k</sup>* ∈ *G<sup>k</sup>* for all *k*, we define *g*′<sub>1</sub> = *g<sup>a</sup>g<sup>b</sup>g<sup>c</sup>* and *g*′<sub>2</sub> = *g<sup>d</sup>g<sup>e</sup>g<sup>f</sup>*. There is a natural map *g*′<sub>1</sub> → *g* and *g*′<sub>2</sub> → *g*, and the situation is similar for other products and limits of products. It is easily shown that this mapping is a homomorphism.

Note that while *G* is a group of transformations on Φ, the extended group *G*′ must be considered as an abstract group.

#### *Theorem 3.*

*(i) For g′ ∈ G′ there is a unique g<sup>0</sup> ∈ G<sup>0</sup> such that V(g′) = U(g<sup>0</sup>). The mapping g′ → g<sup>0</sup> is a homomorphism.*

*(ii) If g′ → g<sup>0</sup> by the homomorphism of (i), and g′ ≠ e′ in G′, then g<sup>0</sup> ≠ e in G<sup>0</sup>.*

Proof. (i) Consider the case where *g*′ = *g<sup>a</sup>g<sup>b</sup>g<sup>c</sup>* with *g<sup>k</sup>* ∈ *G<sup>k</sup>*. Then by the proof of Theorem 1:

$$V(g') = U\_a U(g^a) U\_a^\dagger U\_b U(g^b) U\_b^\dagger U\_c U(g^c) U\_c^\dagger = U(g\_{0a} g^a g\_{a0} g\_{0b} g^b g\_{b0} g\_{0c} g^c g\_{c0}) = U(g^0),$$

where *g*<sup>0</sup> ∈ *G*<sup>0</sup>. The group element *g*<sup>0</sup> is unique since the decomposition *g*′ = *g<sup>a</sup>g<sup>b</sup>g<sup>c</sup>* is unique for *g*′ ∈ *G*′. The proof is similar for other decompositions and limits of these. By the construction, the mapping *g*′ → *g*<sup>0</sup> is a homomorphism.

(ii) Assume that *g*<sup>0</sup> = *e* and *g*′ ≠ *e*′. Since *U*(*g*<sup>0</sup>)*f̃*(*λ*<sup>0</sup>(*φ*)) = *f̃*(*λ*<sup>0</sup>((*g*<sup>0</sup>)<sup>−1</sup>(*φ*))), it follows from *g*<sup>0</sup> = *e* that *U*(*g*<sup>0</sup>) = *I* on *H*. But then from (i), *V*(*g*′) = *I*, and since *V* is a single-valued representation, it follows that *g*′ = *e*′, contrary to the assumption.


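Theorems 1-3 are valid in any maximal symmetrical epistemic setting. I will now again specialize to the case where the accessible e-variables *λ* have a finite discrete range. This is often done in elementary quantum theory texts, and in fact also in recent quantum foundation papers, and in our situation it has several advantages:

• It is easy to interpret the principle that *λ* can be estimated with any fixed accuracy.
• In particular, confidence regions and credibility regions for an accessible e-variable can be taken as single points if observations are accurate enough.
• The operators involved will be much simpler and are defined everywhere.
• The operators *A<sup>a</sup>* can be understood directly from the epistemic setting; see above.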

So look at the statement *λ<sup>a</sup>*(*φ*) = *u<sub>k</sub>*. This means two things: 1) One has sought information about the value of the maximally accessible e-variable *λ<sup>a</sup>*, that is, asked the question: What is the value of *λ<sup>a</sup>*? 2) One has obtained the answer *λ<sup>a</sup>* = *u<sub>k</sub>*. This information can be thought of as a perfect measurement, and it can be represented by the indicator function **1**(*λ<sup>a</sup>*(*φ*) = *u<sub>k</sub>*), which is a function in *H<sup>a</sup>*. From Proposition 1, this function can by a unitary transformation be represented in *H*, which now is a vector space with a discrete basis, a finite-dimensional vector space: *U<sub>a</sub>f<sub>k</sub><sup>a</sup>*. However, we have seen that this tentative state definition *U<sub>a</sub>***1**(*λ<sup>a</sup>*(*φ*) = *u<sub>k</sub>*) = *U*(*g*<sub>0*a*</sub>)**1**(*λ*<sup>0</sup>(*g*<sub>0*a*</sub>*φ*) = *u<sub>k</sub>*) led to ambiguities. These ambiguities can be removed by replacing the two *g*<sub>0*a*</sub>'s here in effect by different elements *g*′<sub>0*ai*</sub> of the extended group *G*′. Let *g*′<sub>0*a*1</sub> and *g*′<sub>0*a*2</sub> be two different such elements where both *g*′<sub>0*a*1</sub> → *g*<sub>0*a*</sub> and *g*′<sub>0*a*2</sub> → *g*<sub>0*a*</sub> according to Theorem 2 (ii). I will prove in a moment that this is in fact always possible when *g*<sub>0*a*</sub> ≠ *e*. Let *g*′<sub>*a*</sub> = (*g*′<sub>0*a*1</sub>)<sup>−1</sup>*g*′<sub>0*a*2</sub>, and define

$$f_k^a(\phi) = V(g_a') U_a \mathbf{1}(\lambda^a(\phi) = u_k) = V(g_a') f_k^0(\phi).$$

This gives the relation (2).

In order that the interpretation of *f<sub>k</sub><sup>a</sup>* as a state |*a*; *k*⟩ shall make sense, I need the following result. I assume that *G̃*<sup>0</sup> is non-trivial.

*Theorem 4.*

*a) Assume that two vectors in H satisfy* |*a*; *i*⟩ = |*b*; *j*⟩*, where* |*a*; *i*⟩ *corresponds to λ<sup>a</sup>* = *u<sub>i</sub> for one perfect measurement and* |*b*; *j*⟩ *corresponds to λ<sup>b</sup>* = *u<sub>j</sub> for another perfect measurement. Then there is a one-to-one function F such that λ<sup>b</sup>* = *F*(*λ<sup>a</sup>*) *and u<sub>j</sub>* = *F*(*u<sub>i</sub>*)*. Conversely, if λ<sup>b</sup>* = *F*(*λ<sup>a</sup>*) *and u<sub>j</sub>* = *F*(*u<sub>i</sub>*) *for such a function F, then* |*a*; *i*⟩ = |*b*; *j*⟩*.*

*b) Each* |*a*; *k*⟩ *corresponds to only one* {*λ<sup>a</sup>*, *u<sub>k</sub>*} *pair except possibly for a simultaneous one-to-one transformation of this pair.*

Proof. a) I prove the first statement; the second follows from the proof of the first statement. Without loss of generality consider a system where each e-variable *λ* takes only two values, say 0 and 1. Otherwise we can reduce to a degenerate system with just these two values: The statement |*a*; *i*⟩ = |*b*; *j*⟩ involves, in addition to *λ<sup>a</sup>* and *λ<sup>b</sup>*, only the two values *u<sub>i</sub>* and *u<sub>j</sub>*. By considering a function of the maximally accessible e-variable (compare the next section), we can take one specific value equal to 1, and the others collected in 0. By doing this, we also arrange that both *u<sub>i</sub>* and *u<sub>j</sub>* are 1, so we are comparing the state given by *λ<sup>a</sup>* = 1 with the state given by *λ<sup>b</sup>* = 1.

By the definition, |*a*; 1⟩ = |*b*; 1⟩ can be written

$$V(g_a') U_a \mathbf{1}(\lambda^a(\phi) = 1) = V(g_b') U_b \mathbf{1}(\lambda^b(\phi) = 1)$$

for group elements *g*′<sub>*a*</sub> and *g*′<sub>*b*</sub> in *G*′.

Use Theorem 3(i) and find *g*<sub>*a*</sub><sup>0</sup> and *g*<sub>*b*</sub><sup>0</sup> in *G*<sup>0</sup> such that *V*(*g*′<sub>*a*</sub>) = *U*(*g*<sub>*a*</sub><sup>0</sup>) and *V*(*g*′<sub>*b*</sub>) = *U*(*g*<sub>*b*</sub><sup>0</sup>). Therefore

$$U(g_a^0) U(g_{0a}) \mathbf{1}(\lambda^a(\phi) = 1) = U(g_b^0) U(g_{0b}) \mathbf{1}(\lambda^b(\phi) = 1);$$

$$\mathbf{1}(\lambda^a(\phi) = 1) = U(g^0) \mathbf{1}(\lambda^b(\phi) = 1) = \mathbf{1}(\lambda^b((g^0)^{-1}\phi) = 1),$$

for *g*<sup>0</sup> = (*g*<sub>0*a*</sub>)<sup>−1</sup>(*g*<sub>*a*</sub><sup>0</sup>)<sup>−1</sup>*g*<sub>*b*</sub><sup>0</sup>*g*<sub>0*b*</sub>.

Both *λ<sup>a</sup>* and *λ<sup>b</sup>* take only the values 0 and 1. Since the set where *λ<sup>b</sup>*(*φ*) = 1 can be transformed into the set where *λ<sup>a</sup>*(*φ*) = 1, we must have *λ<sup>a</sup>* = *F*(*λ<sup>b</sup>*) for some transformation *F*.

b) follows trivially from a).

*Corollary.*

*The group G is properly contained in G*′*, so the representation V of Theorem 1 is really multivalued.*

Proof. If we had *G*′ = *G*, then |*a*; *k*⟩ and |*b*; *k*⟩ would both reduce to *U<sub>a</sub>***1**(*λ<sup>a</sup>*(*φ*) = *u<sub>k</sub>*) = *U<sub>b</sub>***1**(*λ<sup>b</sup>*(*φ*) = *u<sub>k</sub>*) = **1**(*λ*<sup>0</sup> = *u<sub>k</sub>*), so Theorem 4 and its proof could not be valid.

Theorem 4 and its corollary are also valid in the situation where we are interested in just two accessible variables *λ<sup>a</sup>* and *λ<sup>b</sup>*, which might as well be called *λ*<sup>0</sup> and *λ<sup>a</sup>*. We can then provisionally let the group *G* be generated by *g*<sub>0*a*</sub>, *g*<sub>*a*0</sub> = *g*<sub>0*a*</sub><sup>−1</sup> and all elements *g*<sup>0</sup> and *g<sup>a</sup>*. The earlier statement that it is always possible to find two *different* elements *g*′<sub>0*a*1</sub> and *g*′<sub>0*a*2</sub> in *G*′ which are mapped onto *g*<sub>0*a*</sub> then follows.

Finally we have

*Theorem 5.*

*For each a* ∈ A*, the vectors* {|*a*; *k*⟩; *k* = 1, 2, ...} *form an orthonormal basis for H.*

Proof. Taking the invariant measure *ρ* on *H* as normalized to 1, the indicator functions |0; *k*⟩ = **1**(*λ*<sup>0</sup>(*φ*) = *u<sub>k</sub>*) form an orthonormal basis for *H*. Since the mapping |0; *k*⟩ → |*a*; *k*⟩ is unitary, the Theorem follows.

So if *b* ≠ *a* and *k* is fixed, there are complex constants *c<sub>ki</sub>* such that |*b*; *k*⟩ = ∑<sub>*i*</sub> *c<sub>ki</sub>*|*a*; *i*⟩. This opens the way for the interference effects discussed in quantum mechanical texts. In particular |*a*; *k*⟩ = ∑<sub>*i*</sub> *d<sub>ki</sub>*|0; *i*⟩ for some constants *d<sub>ki</sub>*. This is the first instance of something that we will also meet later in different situations: New states in *H* are found by taking linear combinations of a basic set of state vectors.
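Theorem 5 and the expansion in a basic set of state vectors can be illustrated numerically. The following is a minimal sketch in plain Python, assuming a two-dimensional *H* and using a real rotation as a hypothetical stand-in for the unitary operator *V*(*g*′<sub>*a*</sub>); the names `basis_0`, `V_a` and `d` are illustrative and do not come from the text.

```python
import math


def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


def apply(matrix, vec):
    """Apply a matrix (given as a list of rows) to a vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


# Reference basis |0;k>: the indicator functions become unit vectors.
basis_0 = [[1 + 0j, 0j], [0j, 1 + 0j]]

# Hypothetical unitary standing in for V(g'_a): a rotation by pi/4.
c, s = math.cos(math.pi / 4), math.sin(math.pi / 4)
V_a = [[c, -s], [s, c]]

# |a;k> = V(g'_a)|0;k>
basis_a = [apply(V_a, ket) for ket in basis_0]

# Gram matrix <a;j|a;k>: the identity when the new basis is orthonormal.
gram = [[inner(u, v) for v in basis_a] for u in basis_a]

# Expansion coefficients d_ki = <0;i|a;k>, so |a;k> = sum_i d_ki |0;i>.
d = [[inner(basis_0[i], basis_a[k]) for i in range(2)] for k in range(2)]
```

Because the map |0; *k*⟩ → |*a*; *k*⟩ is unitary, `gram` comes out as the identity matrix up to rounding, and the rows of `d` give the coefficients of each |*a*; *k*⟩ in the reference basis.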

### **6. The general symmetrical epistemic setting**

Go back to the definition of the maximal symmetrical epistemic setting. Let again *φ* be the inaccessible conceptual variable and let *λ<sup>a</sup>* for *a* ∈ A be the maximal accessible conceptual variables, functions of *φ*. Let the corresponding induced groups *G<sup>a</sup>* and *G* satisfy the assumptions a)-c). Finally, let *t<sup>a</sup>* for each *a* be an arbitrary function on the range of *λ<sup>a</sup>*, and assume that we observe *θ<sup>a</sup>* = *t<sup>a</sup>*(*λ<sup>a</sup>*); *a* ∈ A. We will call this the symmetrical epistemic setting; it is no longer necessarily maximal with respect to the observations *θ<sup>a</sup>*.

Consider first the quantum states |*a*; *k*⟩. We are no longer interested in the full information on *λ<sup>a</sup>*, but keep the Hilbert space as in Section 5, and now let *h<sub>k</sub><sup>a</sup>*(*φ*) = **1**(*t<sup>a</sup>*(*λ<sup>a</sup>*) = *t<sup>a</sup>*(*u<sub>k</sub>*)) = **1**(*θ<sup>a</sup>* = *u<sub>k</sub><sup>a</sup>*), where *u<sub>k</sub><sup>a</sup>* = *t<sup>a</sup>*(*u<sub>k</sub>*). We let again *g*′<sub>0*a*1</sub> and *g*′<sub>0*a*2</sub> be two distinct elements of *G*′ such that *g*′<sub>0*ai*</sub> → *g*<sub>0*a*</sub>, define *g*′<sub>*a*</sub> = (*g*′<sub>0*a*1</sub>)<sup>−1</sup>*g*′<sub>0*a*2</sub> and then

$$|a;k\rangle = V(g_a') U_a h_k^a = V(g_a') |0;k\rangle,$$

where |0; *k*⟩ = *h<sub>k</sub>*<sup>0</sup>.

*Interpretation of the state vector* |*a*; *k*⟩*:*

*1) The question: 'What is the value of θ<sup>a</sup>?' has been posed. 2) We have obtained the answer θ<sup>a</sup>* = *u<sub>k</sub><sup>a</sup>. Both the question and the answer are contained in the state vector.*

From this we may define the operator connected to the e-variable *θ<sup>a</sup>*:

$$A^a = \sum_k u_k^a |a;k\rangle\langle a;k| = \sum_k t^a(u_k)|a;k\rangle\langle a;k|.$$

Then *A<sup>a</sup>* is no longer necessarily an operator with distinct eigenvalues, but *A<sup>a</sup>* is still Hermitian: *A*<sup>*a*†</sup> = *A<sup>a</sup>*.

*Interpretation of the operator A<sup>a</sup>:*

*This gives all possible states and all possible values corresponding to the accessible e-variable θ<sup>a</sup>.*

The projectors |*a*; *k*⟩⟨*a*; *k*| and hence the ket vectors |*a*; *k*⟩ are no longer uniquely determined by *A<sup>a</sup>*: They can be transformed arbitrarily by unitary transformations in each space corresponding to one eigenvalue. In general I will redefine |*a*; *k*⟩ by allowing it to be subject to such transformations. These transformed eigenvectors all still correspond to the same eigenvalue, that is, the same observed value of *θ<sup>a</sup>*, and they give the same operators *A<sup>a</sup>*. In particular, in the maximal symmetrical epistemic setting I will allow an arbitrary constant phase factor in the definition of the |*a*; *k*⟩'s.
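The non-uniqueness of the eigenvectors can be checked on a small example. The sketch below, in plain Python, assumes a three-dimensional *H* and a hypothetical non-injective function *t<sup>a</sup>* that maps two of the three values of *λ<sup>a</sup>* to the same value; all numbers and names are illustrative only.

```python
import math


def outer(u, v):
    """Matrix |u><v| as a list of rows."""
    return [[x * y.conjugate() for y in v] for x in u]


def operator(values, kets):
    """Build sum_k values[k] |k><k| from eigenvalues and eigenvectors."""
    n = len(kets[0])
    total = [[0j] * n for _ in range(n)]
    for value, ket in zip(values, kets):
        projector = outer(ket, ket)
        for i in range(n):
            for j in range(n):
                total[i][j] += value * projector[i][j]
    return total


def dagger(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


u = [1, 2, 3]                     # values u_k of lambda^a
t = {1: 1.0, 2: 1.0, 3: 2.0}      # hypothetical t^a, not one-to-one
eigenvalues = [t[x] for x in u]   # u_k^a = t^a(u_k)

kets = [[1 + 0j, 0j, 0j], [0j, 1 + 0j, 0j], [0j, 0j, 1 + 0j]]
A = operator(eigenvalues, kets)

# Rotate the two eigenvectors that share the eigenvalue 1.0.
c, s = math.cos(0.3), math.sin(0.3)
rotated = [[c + 0j, s + 0j, 0j], [-s + 0j, c + 0j, 0j], kets[2]]
A_rotated = operator(eigenvalues, rotated)
```

Here `A` equals its own conjugate transpose, and `A_rotated` agrees with `A` up to rounding, although the individual projectors built from `rotated` differ from those built from `kets`.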

As an example of the general construction, assume that *λ<sup>a</sup>* is a vector: *λ<sup>a</sup>* = (*θ*<sup>*a*<sub>1</sub></sup>, ..., *θ*<sup>*a*<sub>*m*</sub></sup>). Then one can write a state vector corresponding to *λ<sup>a</sup>* as

$$|a;k\rangle = |a_1;k_1\rangle \otimes \dots \otimes |a_m;k_m\rangle$$

in an obvious notation, where *a* = (*a*<sub>1</sub>, ..., *a<sub>m</sub>*) and *k* = (*k*<sub>1</sub>, ..., *k<sub>m</sub>*). The different *θ*'s may be connected to different subsystems.
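The product construction can be sketched for *m* = 2 as follows. This is an illustration in plain Python only, assuming two two-level subsystems with hypothetical orthonormal bases `sub_1` and `sub_2`; the tensor product of vectors is realized as a Kronecker product.

```python
import math


def kron(u, v):
    """Kronecker (tensor) product of two vectors given as lists."""
    return [x * y for x in u for y in v]


def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


# Hypothetical orthonormal bases for two two-level subsystems.
r = 1 / math.sqrt(2)
sub_1 = [[1 + 0j, 0j], [0j, 1 + 0j]]            # |a_1; k_1>
sub_2 = [[r + 0j, r + 0j], [r + 0j, -r + 0j]]   # |a_2; k_2>

# |a; k> = |a_1; k_1> (x) |a_2; k_2> for every pair k = (k_1, k_2).
product_states = {
    (k1, k2): kron(sub_1[k1], sub_2[k2])
    for k1 in range(2)
    for k2 in range(2)
}
```

The four product vectors are again orthonormal, so they form a basis of the four-dimensional space of the combined system.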

So far I have kept the same groups *G<sup>a</sup>* and *G* when going from *λ<sup>a</sup>* to *θ<sup>a</sup>* = *t<sup>a</sup>*(*λ<sup>a</sup>*), that is, from the maximal symmetrical epistemic setting to the general symmetrical epistemic setting. This implies that the (large) Hilbert space will be the same. A special case occurs if *t<sup>a</sup>* is a reduction to an orbit of *G<sup>a</sup>*. This is the kind of model reduction mentioned at the end of Section 2. Then the construction of the previous sections can also be carried out with a smaller group action acting just upon an orbit, resulting in a smaller Hilbert space. In the example of the previous paragraph it may be relevant to consider one Hilbert space for each subsystem. The large Hilbert space is, however, the correct space to use when the whole system is considered.

Connected to a general physical system, one may have many e-variables *θ* and corresponding operators *A*. In the ordinary quantum formalism, there is a well-known theorem saying that, in my formulation, *θ*<sup>1</sup>, ..., *θ<sup>n</sup>* are compatible, that is, there exists an e-variable *λ* such that *θ<sup>i</sup>* = *t<sup>i</sup>*(*λ*) for some functions *t<sup>i</sup>*, if and only if the corresponding operators commute:

$$[A^i, A^j] \equiv A^i A^j - A^j A^i = 0 \text{ for all } i, j.$$

(See Holevo [4].) Compatible e-variables may in principle be estimated simultaneously with arbitrary accuracy.
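The commutation criterion is easy to check for small matrices. The sketch below, in plain Python, compares two operators built as functions of a common *λ* (hypothetical functions `t1` and `t2`, diagonal in the same basis) with the standard Pauli matrices for spin components along the z and x directions, which are a textbook example of incompatible observables rather than something taken from this chapter.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(row_ab, row_ba)] for row_ab, row_ba in zip(ab, ba)]


def is_zero(m, tol=1e-12):
    """True when every matrix entry vanishes up to the tolerance."""
    return all(abs(x) <= tol for row in m for x in row)


def diag(values):
    """Diagonal matrix with the given diagonal entries."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]


u = [1, 2, 3]                      # values of a common e-variable lambda
t1 = lambda x: x % 2               # hypothetical function t^1
t2 = lambda x: x * x               # hypothetical function t^2
A1 = diag([t1(x) for x in u])      # operator for theta^1 = t^1(lambda)
A2 = diag([t2(x) for x in u])      # operator for theta^2 = t^2(lambda)

sigma_z = [[1, 0], [0, -1]]        # Pauli matrix, spin along z
sigma_x = [[0, 1], [1, 0]]         # Pauli matrix, spin along x

compatible = is_zero(commutator(A1, A2))
incompatible = not is_zero(commutator(sigma_z, sigma_x))
```

Both flags evaluate to `True`: operators that are functions of the same *λ* commute, while the two spin components do not.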

The way I have defined a pure state, the only state vectors that are allowed are those which are eigenvectors of some physically meaningful operator. This is hardly a limitation in the spin/angular momentum case, where operators corresponding to all directions are included. Nevertheless it is an open question to find general conditions under which all unit vectors in *H* correspond to states |*a*; *k*⟩ the way I have defined them. It is shown in [5] that this holds under no further conditions for the spin 1/2 case.

### **7. Link to statistical inference**

Assume now the symmetrical epistemic setting. We can think of a spin component in a fixed direction to be assessed. To assume a state |*a*; *k*⟩ is to assume perfect knowledge of the e-variable *θ<sup>a</sup>*: *θ<sup>a</sup>* = *u<sub>k</sub><sup>a</sup>*. Such perfect knowledge is rarely available. In practice we have data *z<sup>a</sup>* about the system, and use these data to obtain knowledge about *θ<sup>a</sup>*. Let us start with Bayesian inference. This assumes prior probabilities *π<sub>k</sub><sup>a</sup>* on the values *u<sub>k</sub><sup>a</sup>*, and after the inference we have posterior probabilities *π<sub>k</sub><sup>a</sup>*(*z<sup>a</sup>*). In either case we summarize this information in the density operator:

$$
\sigma^a = \sum_k \pi_k^a |a;k\rangle\langle a;k|.
$$

*Interpretation of the density operator σ<sup>a</sup>:*

*1) We have posed the question 'What is the value of θ<sup>a</sup>?' 2) We have specified a prior or posterior probability distribution π<sub>k</sub><sup>a</sup> over the possible answers. The probability for all possible answers to the question, formulated in terms of state vectors, can be recovered from the density operator.*

A third possibility for the probability specifications is the relatively new, but important, concept of a confidence distribution ([6], [7]). This is a frequentist alternative to the distribution connected to a parameter (here: e-variable). The idea is that one looks at a one-sided confidence interval for any value of the confidence coefficient *γ*. Let the data be *z*, and let (−∞, *β*(*γ*, *z*)] be such an interval. Then *β*(*γ*) = *β*(*γ*, *z*) is an increasing function. We define *H*(·) = *β*<sup>−1</sup>(·) as the confidence distribution for *θ*. This *H* is a cumulative distribution function, and in the continuous case it is characterized by the property that *H*(*β*(*γ*, *z*)) has a uniform distribution over [0, 1] under the model. For discrete *θ<sup>a</sup>* the confidence distribution function *H<sup>a</sup>* is connected to a discrete distribution, which gives the probabilities *π<sub>k</sub><sup>a</sup>*. Extending the argument in [7] to this situation, this should not be looked upon as a distribution *of θ<sup>a</sup>*, but a distribution *for θ<sup>a</sup>*, to be used in the epistemic process.

Since the sum of the probabilities is 1, the trace (sum of eigenvalues) of any density operator is 1. In the quantum mechanical literature, a density operator is any positive operator with trace 1.
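The trace property, and the recovery of the probabilities from the density operator, can be verified on a small example. The following plain-Python sketch assumes a two-dimensional *H*, a hypothetical orthonormal basis `kets` playing the role of the |*a*; *k*⟩, and illustrative probabilities 0.7 and 0.3.

```python
import math


def outer(u, v):
    """Matrix |u><v| as a list of rows."""
    return [[x * y.conjugate() for y in v] for x in u]


def density_operator(probs, kets):
    """Build sigma = sum_k probs[k] |k><k|."""
    n = len(kets[0])
    sigma = [[0j] * n for _ in range(n)]
    for p, ket in zip(probs, kets):
        projector = outer(ket, ket)
        for i in range(n):
            for j in range(n):
                sigma[i][j] += p * projector[i][j]
    return sigma


def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))


def expectation(ket, m):
    """Quadratic form <ket| m |ket>."""
    n = len(ket)
    return sum(ket[i].conjugate() * m[i][j] * ket[j] for i in range(n) for j in range(n))


r = 1 / math.sqrt(2)
kets = [[r + 0j, r + 0j], [r + 0j, -r + 0j]]   # hypothetical |a;1>, |a;2>
probs = [0.7, 0.3]                              # illustrative pi_k^a

sigma = density_operator(probs, kets)
recovered = [expectation(ket, sigma).real for ket in kets]
```

The trace of `sigma` is 1 up to rounding, and `recovered` reproduces the probabilities, since ⟨*a*; *k*|*σ<sup>a</sup>*|*a*; *k*⟩ = *π<sub>k</sub><sup>a</sup>* for an orthonormal basis.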

Note that specification of the accessible e-variables *θ<sup>a</sup>* is equivalent to specifying *t*(*θa*) for any one-to-one function *t*. The operator *t*(*Aa*) has then distinct eigenvalues if and only if the operator *A<sup>a</sup>* has distinct eigenvalues. Hence it is enough in order to specify the question 1) to give the set of orthonormal vectors |*a*; *k*�.

Given the question *a*, the e-variable *θ<sup>a</sup>* plays the role similar to a parameter in statistical inference, even though it may be connected to a single unit. Inference can be done by preparing many independent units in the same state. Inference is then from data *za*, a part of the total data *z* that nature can provide us with. All inference theory that one finds in standard texts like [8] applies. In particular, the concepts of unbiasedness, equivariance, minimaxity and admissibility apply. None of these concepts are much discussed in the physical literature, first because measurements there are often considered as perfect, at least in elementary texts, secondly because, when measurements are considered in the physical literature, they are discussed in terms of the more abstract concept of an operator-valued measure; see below.

Whatever kind of inference we make on *θa*, we can take as a point of departure the statistical model and the likelihood principle of Section 2. Hence after an experiment is done, and given some context *τ*, all evidence on *θ<sup>a</sup>* is contained in the likelihood *p*(*za*|*τ*, *θa*), where *z<sup>a</sup>* is the portion of the data relevant for inference on *θa*, also assumed discrete. This is summarized in the likelihood effect:

$$E(z^a, \tau) = \sum\_{k} p(z^a | \tau, \theta^a = u\_k^a) | a; k \rangle \langle a; k |.$$

*Interpretation of the likelihood effect E*(*z<sup>a</sup>*, *τ*)*:*

*1) We have posed some inference question on the accessible e-variable θ<sup>a</sup>. 2) We have specified the relevant likelihood for the data. The likelihood for all possible answers of the question, formulated in terms of state vectors, can be recovered from the likelihood effect.*

Since the focused question assumes discrete data, each likelihood is in the range 0 ≤ *p* ≤ 1. In the quantum mechanical literature, an effect is any operator with eigenvalues in the range [0, 1].
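The likelihood effect can be made concrete with a minimal sketch. It assumes, purely for illustration, a two-dimensional real Hilbert space, a rotated orthonormal basis standing in for the vectors |*a*; *k*⟩, and made-up likelihood values; none of these numbers come from the text.

```python
from math import cos, sin, sqrt


def likelihood_effect(likelihoods, basis):
    """E = sum_k p(z | tau, theta = u_k) |a;k><a;k| for real basis vectors."""
    dim = len(basis[0])
    return [[sum(p * v[i] * v[j] for p, v in zip(likelihoods, basis))
             for j in range(dim)] for i in range(dim)]


def quadratic_form(v, M):
    """<v|M|v> for a real vector v and a real matrix M."""
    return sum(v[i] * M[i][j] * v[j] for i in range(len(v)) for j in range(len(v)))


def eigenvalues_2x2(M):
    """Eigenvalues of a real symmetric 2x2 matrix, smallest first."""
    mean = (M[0][0] + M[1][1]) / 2
    radius = sqrt(((M[0][0] - M[1][1]) / 2) ** 2 + M[0][1] ** 2)
    return mean - radius, mean + radius


angle = 0.4  # hypothetical rotation defining the basis |a;1>, |a;2>
basis = [[cos(angle), sin(angle)], [-sin(angle), cos(angle)]]
likelihoods = [0.7, 0.2]  # hypothetical values of p(z | tau, theta = u_k)

E = likelihood_effect(likelihoods, basis)
print(eigenvalues_2x2(E))                        # the eigenvalues are the likelihoods
print([quadratic_form(v, E) for v in basis])     # each likelihood recovered from E
```

The eigenvalues of *E* are the likelihood values themselves, so they lie in [0, 1], and each likelihood is recovered as ⟨*a*; *k*|*E*|*a*; *k*⟩.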

Return now to the likelihood principle of Section 2. The following principle follows from it.

#### *The focused likelihood principle (FLP)*

*Consider two potential experiments in the symmetrical epistemic setting with equivalent contexts τ, and assume that the inaccessible conceptual variable φ is the same in both experiments. Suppose that the observations z*<sub>1</sub><sup>∗</sup> *and z*<sub>2</sub><sup>∗</sup> *have proportional likelihood effects in the two experiments, with a constant of proportionality independent of the conceptual variable. Then the questions posed in the two experiments are equivalent, that is, there is an e-variable θ<sup>a</sup> which can be considered to be the same in the two experiments, and the two observations produce the same evidence on θ<sup>a</sup> in this context.*

In many examples the two observations will have equal, not only proportional, likelihood effects. Then the FLP says simply that the experimental evidence is a function of the likelihood effect.

In the FLP we have the freedom to redefine the e-variable in the case of coinciding eigenvalues in the likelihood effect, that is, if *p*(*z<sup>a</sup>*|*τ*, *θ<sup>a</sup>* = *u<sub>k</sub>*) = *p*(*z<sup>a</sup>*|*τ*, *θ<sup>a</sup>* = *u<sub>l</sub>*) for some *k* ≠ *l*. An extreme case is the likelihood effect *E*(*z<sup>a</sup>*, *τ*) = *I*, where all the likelihoods are 1, that is, the probability of *z<sup>a</sup>* is 1 under any considered model. Then any accessible e-variable *θ<sup>a</sup>* will serve our purpose.

We are now ready to define the operator-valued measure in this discrete case:

$$M^a(B|\tau) = \sum\_{z^a \in B} E(z^a, \tau)$$

for any Borel set *B* in the sample space for experiment *a*. Its usefulness will be seen after we have discussed Born's formula. Then we will also have the background for reading much of [9], a survey of quantum statistical inference.
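A small sketch may help to see what the operator-valued measure does in the discrete case. It assumes a hypothetical experiment with three possible data values and an e-variable with two values; the likelihood table and the basis are invented for illustration.

```python
from math import cos, sin


def effect(likelihoods, basis):
    """E(z) = sum_k p(z | tau, theta = u_k) |a;k><a;k| for real basis vectors."""
    dim = len(basis)
    return [[sum(p * v[i] * v[j] for p, v in zip(likelihoods, basis))
             for j in range(dim)] for i in range(dim)]


def measure(B, table, basis):
    """M(B) = sum of the likelihood effects E(z) over the data values z in B."""
    dim = len(basis)
    M = [[0.0] * dim for _ in range(dim)]
    for z in B:
        E = effect(table[z], basis)
        for i in range(dim):
            for j in range(dim):
                M[i][j] += E[i][j]
    return M


angle = 1.1  # hypothetical rotation defining the basis |a;1>, |a;2>
basis = [[cos(angle), sin(angle)], [-sin(angle), cos(angle)]]

# table[z][k] = p(z | tau, theta = u_k); each column k sums to 1 over z.
table = {0: [0.6, 0.1], 1: [0.3, 0.3], 2: [0.1, 0.6]}

M_all = measure(table.keys(), table, basis)   # measure of the whole sample space
M_part = measure([0, 1], table, basis)        # measure of the event {0, 1}
print(M_all)
print(M_part)
```

Because the likelihoods sum to 1 over the sample space for every value of the e-variable, the measure of the whole sample space is the identity operator, up to rounding.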

### **8. Rationality and experimental evidence**

Throughout this section I will consider a fixed context *τ* and a fixed epistemic setting in this context. The inaccessible e-variable is *φ*, and I assume that the accessible e-variables *θ<sup>a</sup>* take a discrete set of values. Let the data behind the potential experiment be *z<sup>a</sup>*, also assumed to take a discrete set of values.

Let first a single experimentalist *A* be in this situation, and let all conceptual variables be attached to *A*, although he also has the possibility of receiving information from others through part of the context *τ*. He has the choice of doing different experiments *a*, and he also has the choice of different models for his experiment through his likelihood *p<sub>A</sub>*(*z<sup>a</sup>*|*τ*, *θ<sup>a</sup>*). The experiment and the model, hence the likelihood, should be chosen before the data are obtained. All these choices are summarized in the likelihood effect *E*, a function of the at present unknown data *z<sup>a</sup>*. For use after the experiment, he should also choose a good estimator/predictor *θ̂<sup>a</sup>*, and he may also have to choose some loss function, but the principles behind these latter choices will be considered as part of the context *τ*. If he chooses to do a Bayesian analysis, the estimator should be based on a prior *π*(*θ<sup>a</sup>*|*τ*). We assume that *A* is trying to be as rational as possible in all his choices, and that this rationality is connected to his loss function or to other criteria.

What should be meant by experimental evidence, and how should it be measured? As a natural choice, let the experimental evidence that we are seeking be the marginal probability of the obtained data for a fixed experiment and for a given likelihood function. From the experimentalist *A*'s point of view this is given by:

$$p\_A^a(z^a|\tau) = \sum\_k p\_A(z^a|\tau, \theta^a = u\_k) \pi\_A(\theta^a = u\_k|\tau),$$

assuming the likelihood chosen by *A* and *A*'s prior *π<sub>A</sub>* for *θ<sup>a</sup>*. In a non-Bayesian analysis, we can let *p<sup>a</sup><sub>A</sub>*(*z<sup>a</sup>*|*τ*) be the probability given the true value *u*<sup>0</sup><sub>*k*</sub> of the e-variable: *p<sup>a</sup><sub>A</sub>*(*z<sup>a</sup>*|*τ*) = *p<sub>A</sub>*(*z<sup>a</sup>*|*τ*, *θ<sup>a</sup>* = *u*<sup>0</sup><sub>*k*</sub>). In general, take *p<sup>a</sup><sub>A</sub>*(*z<sup>a</sup>*|*τ*) as the probability of the part of the data *z<sup>a</sup>* which *A* assesses in connection with his inference on *θ<sup>a</sup>*. By the FLP, specialized to the case of one experiment and equal likelihoods, this experimental evidence must be a function of the likelihood effect: *p<sup>a</sup><sub>A</sub>*(*z<sup>a</sup>*|*τ*) = *q<sub>A</sub>*(*E*(*z<sup>a</sup>*)|*τ*).

We have to make precise in some way what is meant by the rationality of the experimentalist *A*. He has to make many difficult choices on the basis of uncertain knowledge. His actions can be based partly on intuition, partly on experience from similar situations, partly on a common scientific culture and partly on advice from other persons. These other persons will in turn have their intuition, their experience and their scientific education. Often *A* will have certain explicitly formulated principles on which to base his decisions, but sometimes he has to dispense with the principles. In the latter case, he has to rely on some 'inner voice', a conviction which tells him what to do.

We will formalize all this by introducing a perfectly rational superior actor *D*, to which all these principles, experiences and convictions can be related. We also assume that *D* can observe everything that is going on, in particular *A*, and that he on this background can have some influence on *A*'s decisions. The real experimental evidence will then be defined as *the probability of the data z<sup>a</sup> from D's point of view, which we assume also to give the real objective probabilities*. By the FLP this must again be a function of the likelihood effect *E*, where the likelihood now may be seen as the objectively correct model.

$$p^a(z^a|\tau) = q(E(z^a)|\tau) \tag{4}$$

As said, we assume that *D* is perfectly rational. This can be formalized mathematically by considering a hypothetical betting situation for *D* against a bookie, nature *N*. A similar discussion was recently given in [10] using a more abstract language. Note the difference from the ordinary Bayesian assumption, where *A* himself is assumed to be perfectly rational. This difference is crucial to me. I do not see any human scientist, including myself, as being perfectly rational. We can try to be as rational as possible, but we have to rely on some underlying rational principles that partly determine our actions.

So let the hypothetical odds of a given bet for *D* be (1 − *q*)/*q* to 1, where *q* is the probability as defined by (4). This odds specification is a way to make precise that, given the context *τ* and given the question *a*, the bettor's probability that the experimental result takes some value is given by *q*: For a given utility measured by *x*, the bettor *D* pays an amount *qx*, the stake, to the bookie. After the experiment the bookie pays out an amount *x*, the payoff, to the bettor if the result of the experiment takes the value *z<sup>a</sup>*; otherwise nothing is paid.
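The arithmetic of a single bet can be sketched as follows. The probability and the payoff are hypothetical numbers chosen only to show how the stake, the payoff and the odds (1 − *q*)/*q* fit together.

```python
from fractions import Fraction


def net_gain(q, x, event_occurs):
    """Net amount for the bettor: payoff x if the event occurs, minus the stake q*x."""
    stake = q * x
    payoff = x if event_occurs else 0
    return payoff - stake


def expected_net_gain(q, x, true_probability):
    """Expected net amount when the event has probability true_probability."""
    return (true_probability * net_gain(q, x, True)
            + (1 - true_probability) * net_gain(q, x, False))


q = Fraction(1, 4)   # hypothetical probability of the result
x = Fraction(100)    # hypothetical payoff

win = net_gain(q, x, True)     # amount won if the result occurs
loss = net_gain(q, x, False)   # negative: the stake is lost otherwise
odds = win / (q * x)           # winnings per unit of stake
print(win, loss, odds)
print(expected_net_gain(q, x, q))
```

With these numbers the winnings per unit of stake equal (1 − *q*)/*q* = 3, and the expected net amount is zero when the probability of the result really is *q*, which is the sense in which the bet is fair.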

The rationality of *D* is formulated in terms of

*The Dutch book principle.*

*No choice of payoffs in a series of bets shall lead to a sure loss for the bettor.*

For a related use of the same principle, see [11].

*Assumption D.*

*Consider in some context τ a maximal symmetrical epistemic setting where the FLP is satisfied, and the whole situation is observed and acted upon by a superior actor D as described above. Assume that D's probabilities q given by (4) are taken as the experimental evidence, and that D acts rationally in agreement with the Dutch book principle.*

A situation where Assumption D holds together with the assumptions of a symmetric epistemic setting will be called a *rational epistemic setting*.

*Theorem 6.*

*Assume a rational epistemic setting. Let E*<sub>1</sub> *and E*<sub>2</sub> *be two likelihood effects in this setting, and assume that E*<sub>1</sub> + *E*<sub>2</sub> *also is a likelihood effect. Then the experimental evidences, taken as the probabilities of the corresponding data, satisfy*

$$q(E\_1 + E\_2 | \tau) = q(E\_1 | \tau) + q(E\_2 | \tau).$$

Proof. The result of the theorem is obvious, without making Assumption D, if *E*<sub>1</sub> and *E*<sub>2</sub> are likelihood effects connected to experiments on the same e-variable *θ<sup>a</sup>*. We will prove it in general. Consider then any finite number of potential experiments including the two with likelihood effects *E*<sub>1</sub> and *E*<sub>2</sub>. Let *q*<sub>1</sub> = *q*(*E*<sub>1</sub>|*τ*) be equal to (4) for the first experiment, and let *q*<sub>2</sub> = *q*(*E*<sub>2</sub>|*τ*) be equal to the same quantity for the second experiment. Consider in addition the following randomized experiment: Throw an unbiased coin. If heads, choose the experiment with likelihood effect *E*<sub>1</sub>; if tails, choose the experiment with likelihood effect *E*<sub>2</sub>. This is a valid experiment. The likelihood effect when the coin shows heads is (1/2)*E*<sub>1</sub>, and when it shows tails (1/2)*E*<sub>2</sub>, so that the likelihood effect of this experiment is *E*<sub>0</sub> = (1/2)(*E*<sub>1</sub> + *E*<sub>2</sub>). Define *q*<sub>0</sub> = *q*(*E*<sub>0</sub>). Let the bettor bet on the results of all these 3 experiments: Payoff *x*<sub>1</sub> for experiment 1, payoff *x*<sub>2</sub> for experiment 2 and payoff *x*<sub>0</sub> for experiment 0.

I will distinguish 3 possible outcomes: Either the likelihood effect from the data *z* is *E*<sub>1</sub>, or it is *E*<sub>2</sub>, or it is neither of these. The randomization in the choice of *E*<sub>0</sub> is considered separately from the result of the bet. (Technically this can be done by repeating the whole series of experiments many times with the same randomization. This is also consistent with the conditionality principle.) Thus if *E*<sub>1</sub> occurs, the payoff for experiment 0 is replaced by the expected payoff *x*<sub>0</sub>/2, and similarly if *E*<sub>2</sub> occurs. The net expected amount the bettor receives is then

$$x\_1 + \frac{1}{2}x\_0 - q\_1 x\_1 - q\_2 x\_2 - q\_0 x\_0 = (1 - q\_1)x\_1 - q\_2 x\_2 + (1 - 2q\_0)\frac{1}{2}x\_0 \text{ if } E\_1,$$

$$x\_2 + \frac{1}{2}x\_0 - q\_1 x\_1 - q\_2 x\_2 - q\_0 x\_0 = -q\_1 x\_1 + (1 - q\_2)x\_2 + (1 - 2q\_0)\frac{1}{2}x\_0 \text{ if } E\_2,$$

$$-q\_1 x\_1 - q\_2 x\_2 - 2q\_0 \cdot \frac{1}{2}x\_0 \text{ otherwise.}$$

The payoffs (*x*<sub>1</sub>, *x*<sub>2</sub>, *x*<sub>0</sub>) can be chosen by nature *N* in such a way that they lead to a sure loss for the bettor *D* unless the determinant of this system is zero:

$$0 = \begin{vmatrix} 1 - q\_1 & -q\_2 & 1 - 2q\_0 \\ -q\_1 & 1 - q\_2 & 1 - 2q\_0 \\ -q\_1 & -q\_2 & -2q\_0 \end{vmatrix} = q\_1 + q\_2 - 2q\_0.$$

Thus we must have

$$q(\frac{1}{2}(E\_1 + E\_2)|\tau) = \frac{1}{2}(q(E\_1|\tau) + q(E\_2|\tau)).$$

If *E*<sub>1</sub> + *E*<sub>2</sub> is an effect, the common factor 1/2 can be removed by changing the likelihoods, and the result follows.
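The determinant argument in the proof can be checked with a short sketch. The matrix below has the same rows as the determinant displayed above, with columns giving the coefficients of *x*<sub>1</sub>, *x*<sub>2</sub> and *x*<sub>0</sub>/2 in the three outcomes. The prices *q*<sub>1</sub>, *q*<sub>2</sub>, *q*<sub>0</sub> are hypothetical, and the helper names are introduced here for illustration only.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def outcome_matrix(q1, q2, q0):
    """Rows: E1 occurs, E2 occurs, neither. Columns: coefficients of x1, x2, x0/2."""
    return [
        [1 - q1, -q2, 1 - 2 * q0],
        [-q1, 1 - q2, 1 - 2 * q0],
        [-q1, -q2, -2 * q0],
    ]


def net_outcomes(q1, q2, q0, x1, x2, x0):
    """Net expected amount for the bettor in each of the three outcomes."""
    return [row[0] * x1 + row[1] * x2 + row[2] * x0 / 2
            for row in outcome_matrix(q1, q2, q0)]


def sure_loss_payoffs(q1, q2, q0, loss=1):
    """Payoffs giving the bettor the same loss in every outcome (Cramer's rule)."""
    A = outcome_matrix(q1, q2, q0)
    d = det3(A)
    if d == 0:
        return None  # singular system: this construction is not available
    solution = []
    for col in range(3):
        Ac = [row[:] for row in A]
        for r in range(3):
            Ac[r][col] = -loss
        solution.append(det3(Ac) / d)
    x1, x2, half_x0 = solution
    return x1, x2, 2 * half_x0


# Hypothetical incoherent prices: q0 differs from (q1 + q2) / 2.
q1, q2, q0 = Fraction(1, 5), Fraction(3, 10), Fraction(2, 5)
print(det3(outcome_matrix(q1, q2, q0)), q1 + q2 - 2 * q0)

payoffs = sure_loss_payoffs(q1, q2, q0)
print(payoffs, net_outcomes(q1, q2, q0, *payoffs))
```

For these prices the determinant is nonzero, and the computed payoffs (some of them negative, which corresponds to reversing the direction of a bet) give the bettor a loss of 1 in all three outcomes. When *q*<sub>0</sub> = (*q*<sub>1</sub> + *q*<sub>2</sub>)/2 the determinant vanishes and the function returns `None`.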

#### *Corollary.*

*Assume a rational epistemic setting. Let E*<sub>1</sub>*, E*<sub>2</sub>*, ... be likelihood effects in this setting, and assume that E*<sub>1</sub> + *E*<sub>2</sub> + ... *also is a likelihood effect. Then*

$$q(E\_1 + E\_2 + \ldots | \tau) = q(E\_1 | \tau) + q(E\_2 | \tau) + \ldots$$

Proof. The finite case follows immediately from Theorem 6. Then the infinite case follows from monotone convergence.

The result of this section is quite general. In particular the loss function and any other criterion for the success of the experiments are arbitrary. So far I have assumed that the choice of experiment *a* is fixed, which implies that it is the same for *A* and for *D*. However, the result also applies to the following more general situation: Let *A* have some definite purpose with his experiment, and to achieve that purpose, he has to choose the question *a* in a clever manner, as rationally as he can. Assume that this rationality is formalized through the actor *D*, who has the ideal likelihood effect *E* and the experimental evidence *p*(*z*|*τ*) = *q*(*E*|*τ*). If two such questions are to be chosen, the result of Theorem 6 holds, with essentially the same proof.

### **9. The Born formula**

#### **9.1. The basic formula**

Born's formula is the basis for all probability calculations in quantum mechanics. In textbooks it is usually stated as a separate axiom, but it has also been argued for by using various sets of assumptions; see [12] for some references. Here I will base the discussion upon the result of Section 8.

I begin with a recent result by Busch [13], giving a new version of a classical mathematical theorem by Gleason. Busch's version has the advantage that it is valid for a Hilbert space of dimension 2, which Gleason's original theorem is not, and it also has a simpler proof. For a proof for the finite-dimensional case, see Appendix 5 of [2].

Let in general *H* be any Hilbert space. Recall that an effect *E* is any operator on the Hilbert space with eigenvalues in the range [0, 1]. A generalized probability measure *µ* is a function on the effects with the properties

$$\begin{array}{l} \text{(1)} \ 0 \le \mu(E) \le 1 \text{ for all } E, \\ \text{(2)} \ \mu(I) = 1, \\ \text{(3)} \ \mu(E\_1 + E\_2 + \dots) = \mu(E\_1) + \mu(E\_2) + \dots \text{ whenever } E\_1 + E\_2 + \dots \le I. \end{array}$$

*Theorem 7. (Busch, 2003).*

*Any generalized probability measure µ is of the form µ*(*E*) = Tr(*σE*) *for some density operator σ.*

It is now easy to see that *q*(*E*|*τ*) = *p*(*z*|*τ*) on the ideal likelihood effects of Section 8 is a generalized probability measure if Assumption D holds: (1) follows since *q* is a probability; (2) since *E* = *I* implies that the likelihood is 1 for all values of the e-variable, hence *p*(*z*) = 1; finally (3) is a consequence of the corollary of Theorem 6. Hence there is a density operator *σ* = *σ*(*τ*) such that *p*(*z*|*τ*) = Tr(*σ*(*τ*)*E*) for all ideal likelihood effects *E* = *E*(*z*).
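The easy direction, that *µ*(*E*) = Tr(*σE*) does satisfy properties (1) to (3), can be illustrated numerically. The density matrix and the effects below are hypothetical 2 × 2 examples with exact rational entries; the converse, which is the content of Theorem 7, is not something a sketch like this can demonstrate.

```python
from fractions import Fraction as F


def trace_of_product(A, B):
    """Tr(AB) for square matrices given as nested lists."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))


def add(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


# Hypothetical density matrix: symmetric, trace 1, positive determinant.
sigma = [[F(3, 4), F(1, 4)], [F(1, 4), F(1, 4)]]
identity = [[F(1), F(0)], [F(0), F(1)]]

# Hypothetical effects whose sum still has eigenvalues below 1.
E1 = [[F(1, 2), F(0)], [F(0), F(1, 5)]]
E2 = [[F(3, 10), F(1, 10)], [F(1, 10), F(2, 5)]]


def mu(E):
    """Candidate generalized probability measure mu(E) = Tr(sigma E)."""
    return trace_of_product(sigma, E)


print(mu(identity))                    # property (2)
print(mu(E1), mu(E2))                  # property (1): both between 0 and 1
print(mu(add(E1, E2)), mu(E1) + mu(E2))  # property (3): additivity
```

Additivity here is just the linearity of the trace; positivity and normalization come from *σ* being positive with trace 1.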

Define now a *perfect experiment* as one where the measurement uncertainty can be disregarded. The quantum mechanical literature relies heavily on perfect experiments, which give well-defined states |*k*⟩. From the point of view of statistics, if, say, the 99% confidence or credibility region of *θ<sup>b</sup>* is the single point *u<sup>b</sup><sub>k</sub>*, we can infer approximately that a perfect experiment has given the result *θ<sup>b</sup>* = *u<sup>b</sup><sub>k</sub>*.

In our symmetric epistemic setting, then, we have asked the question 'What is the value of the accessible e-variable *θ<sup>b</sup>*?', and we are interested in finding the probability of the answer *θ<sup>b</sup>* = *u<sup>b</sup><sub>j</sub>* through a perfect experiment. This is the probability of the state |*b*; *j*⟩. Assume now that this probability is sought in a context *τ* = *τ*<sub>*a*,*k*</sub> defined as follows: We have previous knowledge of the answer *θ<sup>a</sup>* = *u<sup>a</sup><sub>k</sub>* to another accessible question: 'What is the value of *θ<sup>a</sup>*?' That is, we know the state |*a*; *k*⟩. If *θ<sup>a</sup>* is maximally accessible, this is the maximal knowledge about the system that *τ* may contain; in general we assume that the context *τ* does not contain more information about this system. It can contain irrelevant information, however.

*Theorem 8. (Born's formula)*

*Assume a rational epistemic setting. In the above situation we have:*

$$P(\theta^b = u\_j^b | \theta^a = u\_k^a) = |\langle a; k | b; j \rangle|^2.$$

Proof. Fix *j* and *k*, let |*v*⟩ be either |*a*; *k*⟩ or |*b*; *j*⟩, and consider likelihood effects of the form *E* = |*v*⟩⟨*v*|. This corresponds in both cases to a perfect measurement of a maximally accessible parameter with a definite result. By Theorem 7 there exists a density operator *σ*<sub>*a*,*k*</sub> = ∑<sub>*i*</sub> *π*<sub>*i*</sub>(*τ*<sub>*a*,*k*</sub>)|*i*⟩⟨*i*| such that *q*(*E*|*τ*<sub>*a*,*k*</sub>) = ⟨*v*|*σ*<sub>*a*,*k*</sub>|*v*⟩, where the *π*<sub>*i*</sub>(*τ*<sub>*a*,*k*</sub>) are non-negative constants adding to 1. Consider first |*v*⟩ = |*a*; *k*⟩. For this case one must have ∑<sub>*i*</sub> *π*<sub>*i*</sub>(*τ*<sub>*a*,*k*</sub>)|⟨*i*|*a*; *k*⟩|<sup>2</sup> = 1 and thus ∑<sub>*i*</sub> *π*<sub>*i*</sub>(*τ*<sub>*a*,*k*</sub>)(1 − |⟨*i*|*a*; *k*⟩|<sup>2</sup>) = 0. Since every term in this sum is non-negative, this implies for each *i* that either *π*<sub>*i*</sub>(*τ*<sub>*a*,*k*</sub>) = 0 or |⟨*i*|*a*; *k*⟩| = 1. Since the last condition implies |*i*⟩ = |*a*; *k*⟩ (modulo an irrelevant phase factor), and this is a condition which can only be true for one *i*, it follows that *π*<sub>*i*</sub>(*τ*<sub>*a*,*k*</sub>) = 0 for all other *i* than this one, and that *π*<sub>*i*</sub>(*τ*<sub>*a*,*k*</sub>) = 1 for this particular *i*. Summarizing this, we get *σ*<sub>*a*,*k*</sub> = |*a*; *k*⟩⟨*a*; *k*|, and setting |*v*⟩ = |*b*; *j*⟩, Born's formula follows, since *q*(*E*|*τ*<sub>*a*,*k*</sub>) in this case is equal to the probability of the perfect result *θ<sup>b</sup>* = *u<sup>b</sup><sub>j</sub>*.
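The last step of the proof can be checked numerically in a two-dimensional example. The sketch below is only an illustration under assumed state vectors (`a_k` and `b_j` are hypothetical choices): it forms *σ* = |*a*; *k*⟩⟨*a*; *k*| and *E* = |*b*; *j*⟩⟨*b*; *j*| and compares Tr(*σE*) with |⟨*a*; *k*|*b*; *j*⟩|<sup>2</sup>.

```python
import cmath
import math

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def projector(v):
    """Rank-one projector |v><v| for a unit vector v."""
    return [[vi * vj.conjugate() for vj in v] for vi in v]

def trace_product(a, b):
    """Tr(a b) without forming the full matrix product."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

# Hypothetical unit vectors standing in for |a;k> and |b;j>.
a_k = [1 + 0j, 0j]
t, phi = 0.4, 1.1
b_j = [complex(math.cos(t)), cmath.exp(1j * phi) * math.sin(t)]

sigma = projector(a_k)       # the density operator found in the proof
effect = projector(b_j)      # the likelihood effect E = |b;j><b;j|

lhs = trace_product(sigma, effect).real   # Tr(sigma E)
rhs = abs(inner(a_k, b_j)) ** 2           # |<a;k|b;j>|^2
print(lhs, rhs)
```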

#### **9.2. Consequences**

Here are three easy consequences of Born's formula:

(1) If the context of the system is given by the state |*a*; *k*⟩, and *A<sup>b</sup>* is the operator corresponding to the e-variable *θ<sup>b</sup>*, then the expected value of a perfect measurement of *θ<sup>b</sup>* is ⟨*a*; *k*|*A<sup>b</sup>*|*a*; *k*⟩.

(2) If the context is given by a density operator *σ*, and *A* is the operator corresponding to the e-variable *θ*, then the expected value of a perfect measurement of *θ* is Tr(*σA*).

(3) In the same situation the expected value of a perfect measurement of *f*(*θ*) is Tr(*σ f*(*A*)).

Proof of (1):

$$\begin{aligned} \mathbb{E}(\theta^b|\theta^a=u\_k^a) &= \sum\_i u\_i^b P(\theta^b=u\_i^b|\theta^a=u\_k^a) \\\\ &= \sum\_i u\_i^b \langle a;k|b;i\rangle \langle b;i|a;k\rangle = \langle a;k|A^b|a;k\rangle. \end{aligned}$$
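The two sides of this identity can be compared numerically. The following sketch assumes a real two-dimensional example in which the basis {|*b*; *i*⟩} is rotated by a hypothetical angle `t` relative to |*a*; *k*⟩ and the values *u<sup>b</sup><sub>i</sub>* are ±1; none of these choices come from the text.

```python
import math

t = 0.3                                   # hypothetical rotation angle
a_k = [1.0, 0.0]                          # stands in for |a;k>
b_basis = [[math.cos(t), math.sin(t)],    # |b;1>
           [-math.sin(t), math.cos(t)]]   # |b;2>
u_b = [1.0, -1.0]                         # values u_1^b, u_2^b

def inner(u, v):
    """Inner product of real vectors."""
    return sum(x * y for x, y in zip(u, v))

# First line of the proof: sum_i u_i^b P(theta^b = u_i^b | theta^a = u_k^a),
# with the probabilities given by Born's formula.
expectation_from_probabilities = sum(
    u * inner(a_k, b) ** 2 for u, b in zip(u_b, b_basis))

# Last expression: <a;k| A^b |a;k> with A^b = sum_i u_i^b |b;i><b;i|.
A_b = [[sum(u * b[r] * b[c] for u, b in zip(u_b, b_basis)) for c in range(2)]
       for r in range(2)]
expectation_from_operator = sum(
    a_k[r] * A_b[r][c] * a_k[c] for r in range(2) for c in range(2))

print(expectation_from_probabilities, expectation_from_operator)
```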

These results give an extended interpretation of the operator *A* compared to what I gave in Section 5: There is a simple formula for all expectations in terms of the operator. Conversely, the set of such expectations determines the state of the system. Moreover, if *A* is specialized to an indicator function, we get back Born's formula, so the consequences are equivalent to this formula.

As an application of Born's formula, we give the transition probabilities for electron spin. For a given direction *a*, I define the e-variable *θ<sup>a</sup>* as +1 if the spin component found by a perfect measurement on the electron is +*ħ*/2 in this direction, and as −1 if the component is −*ħ*/2. Assume that *a* and *b* are two directions in which the spin component can be measured.

*Proposition 2.*

*For electron spin we have*

$$P(\theta^b = \pm 1 | \theta^a = +1) = \frac{1}{2}(1 \pm \cos(a \cdot b)).$$

This is proved in several textbooks, for instance [4], from Born's formula. A similar proof using the Pauli spin matrices is also given in [5].
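A minimal numerical sketch of such a Pauli-matrix argument is given here. It reads cos(*a* · *b*) in Proposition 2 as the cosine of the angle between the directions *a* and *b*, which is an interpretive assumption, and it uses the standard projectors (*I* ± *n* · *σ*)/2 onto the spin states along a unit vector *n*. The function names are hypothetical and the code is not taken from [4] or [5].

```python
import math

PAULI_X = [[0, 1], [1, 0]]
PAULI_Y = [[0, -1j], [1j, 0]]
PAULI_Z = [[1, 0], [0, -1]]

def spin_projector(n, sign):
    """Projector (I + sign * n.sigma)/2 onto spin sign*hbar/2 along unit vector n."""
    return [[((1 if i == j else 0)
              + sign * (n[0] * PAULI_X[i][j]
                        + n[1] * PAULI_Y[i][j]
                        + n[2] * PAULI_Z[i][j])) / 2
             for j in range(2)] for i in range(2)]

def transition_probability(a, b, sign):
    """P(theta^b = sign | theta^a = +1) computed as Tr(P_a P_b)."""
    pa = spin_projector(a, +1)
    pb = spin_projector(b, sign)
    total = sum(pa[i][k] * pb[k][i] for i in range(2) for k in range(2))
    return complex(total).real

angle = 1.2                                    # assumed angle between a and b
a = (0.0, 0.0, 1.0)
b = (math.sin(angle), 0.0, math.cos(angle))

p_plus = transition_probability(a, b, +1)
p_minus = transition_probability(a, b, -1)
print(p_plus, (1 + math.cos(angle)) / 2)
print(p_minus, (1 - math.cos(angle)) / 2)
```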

Finally, using Born's formula, we deduce the basic formula for quantum measurement. Let the state of a system be given by a density matrix *ρ* = *ρ*(*η*), where *η* is an unknown statistical parameter, and let the measurements be determined by an operator-valued measure *M*(·) as defined in Section 7. Then the probability distribution of the observations is given by

$$P(B; \eta) = \text{Tr}(\rho(\eta)M(B))\,.$$

This, together with an assumption on the state after measurement, is the basis for [9].
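To make the formula *P*(*B*; *η*) = Tr(*ρ*(*η*)*M*(*B*)) concrete, the sketch below uses a hypothetical two-outcome measurement on a two-level system. The family *ρ*(*η*) = diag((1 + *η*)/2, (1 − *η*)/2) and the measure with elements diag((1 ± *κ*)/2, (1 ∓ *κ*)/2) are assumptions made purely for illustration; the general definition of *M*(·) is the one in Section 7.

```python
def trace_product(a, b):
    """Tr(a b) for square matrices of equal size."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def outcome_probabilities(eta, kappa):
    """Return (P(+; eta), P(-; eta)) for a noisy two-outcome measurement.

    eta in [-1, 1] parametrizes the state; kappa in [0, 1] is a hypothetical
    sharpness parameter (kappa = 1 gives a perfect measurement).
    """
    rho = [[(1 + eta) / 2, 0.0], [0.0, (1 - eta) / 2]]
    m_plus = [[(1 + kappa) / 2, 0.0], [0.0, (1 - kappa) / 2]]
    m_minus = [[(1 - kappa) / 2, 0.0], [0.0, (1 + kappa) / 2]]
    return trace_product(rho, m_plus), trace_product(rho, m_minus)

p_plus, p_minus = outcome_probabilities(eta=0.6, kappa=0.8)
print(p_plus, p_minus)
```

In this example the two elements add to the identity, so the probabilities add to 1, and the distribution of the observation depends on the unknown parameter *η* only through (1 + *κη*)/2.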

#### **10. Entanglement, EPR and the Bell theorem**

The total spin components in different directions for a system of two spin 1/2 particles satisfy the assumptions of a maximal symmetric epistemic setting. Assume that we have such a system where *j* = 0, that is, the state is such that the total spin is zero. By ordinary quantum mechanical calculations, this state can be explicitly written as

$$|0\rangle = \frac{1}{\sqrt{2}}(|1,+\rangle \otimes |2,-\rangle - |1,-\rangle \otimes |2,+\rangle),\tag{5}$$

where |1, +⟩ ⊗ |2, −⟩ is a state where particle 1 has a spin component +*ħ*/2 and particle 2 has a spin component −*ħ*/2 along the *z*-axis, and *vice versa* for |1, −⟩ ⊗ |2, +⟩. This is what is called an entangled state, that is, a state which is not a direct product of the component state vectors. I will follow my own programme, however, and stick to the e-variable description.
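That the state (5) is not a direct product can be verified with a short calculation. The sketch below relies on a standard fact for two two-level systems: a state with coefficient matrix *C* factorizes exactly when det(*C*) = 0. The function names and the comparison state are illustrative assumptions.

```python
import math

# Coefficient matrix C[i][j] of |0> = sum_ij C[i][j] |1,i> (x) |2,j>,
# with index 0 for spin + and index 1 for spin -, read off from equation (5).
r = 1 / math.sqrt(2)
singlet = [[0.0, r], [-r, 0.0]]

def is_product_state(coeffs, tol=1e-12):
    """A pure state of two two-level systems factorizes iff det(C) = 0."""
    det = coeffs[0][0] * coeffs[1][1] - coeffs[0][1] * coeffs[1][0]
    return abs(det) < tol

def reduced_state_particle_1(coeffs):
    """Reduced density matrix C C^T of particle 1 (real coefficients assumed)."""
    return [[sum(coeffs[i][k] * coeffs[j][k] for k in range(2))
             for j in range(2)] for i in range(2)]

product_example = [[1.0, 0.0], [0.0, 0.0]]   # |1,+> (x) |2,+>, for comparison
reduced = reduced_state_particle_1(singlet)

print(is_product_state(singlet), is_product_state(product_example))
print(reduced)
```

For the singlet the reduced state of particle 1 is half the identity, so either spin value along any direction has probability 1/2, which agrees with the probabilities used in the discussion of Alice and Bob.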

Assume further that the two particles separate, the spin component of particle 1 is measured in some direction by an observer Alice, and the spin component of particle 2 is measured by an observer Bob. Before the experiment, the two observers agree that each will measure spin either in some fixed direction *a* or in another fixed direction *b*, orthogonal to *a*, both measurements assumed for simplicity to be perfect. As a final assumption, let the positions of the two observers at the time of measurement be spacelike, that is, the distance between them is so large that no signal can reach from one to the other at this time, taking into account that signals cannot go faster than the speed of light by the theory of relativity.

This is Bohm's version of the situation behind the argument against the completeness of quantum mechanics as posed by Einstein et al. [14] and countered by Bohr [15], [16]. This discussion is still sometimes taken up today, although most physicists now support Bohr. So will I, but I will go a step further. The main thesis in [14] was as follows: *If, without in any way disturbing a system, we can predict the value of a physical quantity, then there exists an element of physical reality corresponding to this physical quantity.* Bohr answered by introducing a strict interpretation of this criterion: *To ascribe reality to P, the measurement of an observable whose outcome allows for the prediction of P must actually be performed, or one must give a description of how it can be performed.* Several authors have argued that Einstein's criterion of reality leads to an assumption of non-locality: Signals between observers with a spacelike separation must travel faster than light. Recently, it has been shown in [17] that the possibility is open to interpret the non-locality theorems in the physical literature as arguments supporting the strict criterion of reality, rather than as a violation of locality. I agree with this last interpretation.

I will be very brief on this discussion here. Let *λ* be the spin component in units of *ħ*/2 as measured by Alice, and let *η* be the spin component in the same units as measured by Bob. Alice has a free choice between measuring in the direction *a* and in the direction *b*. In both cases, her probability is 1/2 for each of *λ* = ±1. If she measures *λ<sup>a</sup>* = +1, say, she will predict *η<sup>a</sup>* = −1 for the corresponding component measured by Bob. According to Einstein et al. [14] there should then be an element of reality corresponding to this prediction, but if we adopt the strict interpretation of Bohr here, there is no way in which Alice can predict Bob's actual real measurement at this point of time. Bob on his side also has a free choice of measurement direction *a* or *b*, and in both cases he has the probability 1/2 for each of *η* = ±1. The variables *λ* and *η* are conceptual, the first one connected to Alice and the second one connected to Bob. As long as the two are not able to communicate, there is no sense in which we can make statements like *η* = −*λ* meaningful.

The situation changes, however, if Alice and Bob meet at some time after the measurement. If Alice then says 'I chose to make a measurement in the direction *a* and got the result *u*' and Bob happens to say 'I also chose to make a measurement in the direction *a*, and then I got the result *v*', then these two statements must be consistent: *v* = −*u*. This seems to be a necessary requirement for the consistency of the theory. There is a subtle distinction here. The clue is that the choices of measurement direction both for Alice and for Bob are free and independent. The directions are either equal or different. If they should happen to be different, there is no consistency requirement after the measurement, due to the assumed orthogonality of *a* and *b*. Note again that we have an epistemic interpretation of quantum mechanics. At the time of measurement, nothing exists except the observations by the two observers.

Let us then look at the more complicated situation where *a* and *b* are not necessarily orthogonal, where Alice tosses a coin and measures in the direction *a* if heads and *b* if tails, while Bob tosses an independent coin and measures in some direction *c* if heads and in another direction *d* if tails. Then there is an algebraic inequality

$$
\lambda^a \eta^c + \lambda^b \eta^c + \lambda^b \eta^d - \lambda^a \eta^d \le 2. \tag{6}
$$

Since all the conceptual variables take values ±1, this inequality follows from

$$(\lambda^a + \lambda^b)\eta^c + (\lambda^b - \lambda^a)\eta^d = \pm 2 \le 2.$$
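The algebraic claim can also be confirmed by brute force: one of *λ<sup>a</sup>* + *λ<sup>b</sup>* and *λ<sup>b</sup>* − *λ<sup>a</sup>* is always 0 and the other is ±2, so the left-hand side of (6) only takes the values ±2. A minimal enumeration over all 16 sign combinations, with a hypothetical function name, is:

```python
from itertools import product

def chsh_expression(lam_a, lam_b, eta_c, eta_d):
    """Left-hand side of inequality (6) for given values of the four variables."""
    return lam_a * eta_c + lam_b * eta_c + lam_b * eta_d - lam_a * eta_d

# All 16 assignments of +1 or -1 to (lambda^a, lambda^b, eta^c, eta^d).
values = {chsh_expression(*signs) for signs in product((-1, 1), repeat=4)}
print(values)
```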

Now replace the conceptual variables here with actual measurements. Taking formal expectations in (6) assumes that the products have meaning as random variables; in the physical literature this is stated as an assumption of realism and locality. This leads formally to

$$E(\widehat{\lambda^a}\widehat{\eta^c}) + E(\widehat{\lambda^b}\widehat{\eta^c}) + E(\widehat{\lambda^b}\widehat{\eta^d}) - E(\widehat{\lambda^a}\widehat{\eta^d}) \le 2 \tag{7}$$

This is one of Bell's inequalities, called the CHSH inequality.

On the other hand, quantum-mechanical calculations, that is, Born's formula applied to the basic state (5), show that *a*, *b*, *c* and *d* can be chosen such that Bell's inequality (7) is violated. This is also confirmed by numerous experiments with electrons and photons.
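A sketch of such a calculation follows. It evaluates the quantum-mechanical correlations ⟨0|(*a* · *σ*) ⊗ (*c* · *σ*)|0⟩ in the state (5) for directions restricted to the *x*-*z* plane, which is an assumption made to keep all matrices real. The particular angles are one hypothetical choice for which the left-hand side of (7) reaches 2√2.

```python
import math

def spin_xz(angle):
    """Spin operator in units of hbar/2 along a direction in the x-z plane."""
    return [[math.cos(angle), math.sin(angle)],
            [math.sin(angle), -math.cos(angle)]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices (rows and columns ordered (i, k))."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

# State (5) in the basis order |+,+>, |+,->, |-,+>, |-,->.
SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]

def correlation(alice_angle, bob_angle):
    """Expectation of the product of Alice's and Bob's spin values in state (5)."""
    op = kron(spin_xz(alice_angle), spin_xz(bob_angle))
    return sum(SINGLET[r] * op[r][c] * SINGLET[c]
               for r in range(4) for c in range(4))

# Hypothetical measurement angles for a, b (Alice) and c, d (Bob).
a, b = 0.0, math.pi / 2
c, d = 5 * math.pi / 4, -math.pi / 4

chsh = (correlation(a, c) + correlation(b, c)
        + correlation(b, d) - correlation(a, d))
print(chsh)   # exceeds the bound 2 of inequality (7)
```

Each correlation equals minus the cosine of the angle between the two directions, so equal directions give −1, in agreement with the consistency requirement *v* = −*u*.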

From our point of view the transition from (6) to (7) is not valid. One cannot take the expectation term by term in equation (6). The *λ*'s and *η*'s are conceptual variables belonging to different observers. Any valid statistical expectation must take one of these observers as a point of departure. Look at (6) from Alice's point of view, for instance. She starts by tossing a coin. The outcome of this toss leads to some e-variable *λ* being measured in one of the directions *a* or *b*. This measurement is an epistemic process, and any prediction based upon this measurement is a new epistemic process. During these processes she must obey Conditionality principle 2 of Section 2. By this conditionality principle she should condition upon the outcome of the coin toss. So in any prediction she should condition upon the choice *a* or *b*. It is crucial for this argument that the prediction of an e-variable is an epistemic process, not a process where ordinary probability calculations can be immediately used.

In making predictions from her measurement result, she can use Born's formula. Suppose that she measures *λ<sup>a</sup>* and finds *λ<sup>a</sup>* = +1, for instance. Then she can predict the value of *λ<sup>c</sup>* and hence *η<sup>c</sup>* = −*λ<sup>c</sup>*. Thus she can (given the outcome *a* of the coin toss) compute the expectation of the first term in (6). Similarly, she can compute the expectation of the last term in (6). But there is no way in which she can simultaneously predict *λ<sup>b</sup>* and *η<sup>d</sup>*. Hence the expectation of the second term (and similarly the third term) in (6) is meaningless for her. A similar conclusion is reached if the outcome of the coin toss gives *b*. And of course a similar conclusion is valid if we take Bob's point of view. Therefore the transition from (6) to (7) is not valid, not by non-locality, but by a simple use of the conditionality principle. This can also in some sense be called lack of realism: In this situation it is not meaningful to take expectations from the point of view of an impartial observer. By necessity one must see the situation from the point of view of one of the observers Alice or Bob.

Entanglement is very important in modern applications of quantum mechanics, not least in quantum information theory, including quantum computation. It is also an important ingredient in the theory of decoherence [18], which explains why ordinary quantum effects are not usually visible on a larger scale. Decoherence theory shows the importance of the entanglement of each system with its environment. In particular, it leads in effect to the conclusion that all observers share common observations after decoherence between the system and its environment, and this can then be identified with the 'objective' aspects of the world, which is also what the superior actor *D* of Section 8 would find.

#### **11. Position as an e-variable and the Schrödinger equation**

So far I have looked at e-variables taking a finite discrete set of values, but the concept of an e-variable carries over to the continuous case. Consider the motion of a non-relativistic one-dimensional particle. Its position *ξ* at some time *t* can in principle be determined with arbitrary accuracy, resulting in an arbitrarily short confidence interval. But momentum and hence velocity cannot be determined simultaneously with arbitrary accuracy, hence the vector (*ξ*(*s*), *ξ*(*t*)) for positions at two different time points is inaccessible. Now fix some time *t*. An observer *i* may predict *ξ*(*t*) by conditioning on some *σ*-algebra P<sub>*i*</sub> of information from the past. This may be information from some time point *s<sub>i</sub>* < *t*, but it can also take other forms. We must think of different observers as hypothetical; only one of them can be realized. Nevertheless one can imagine that all this information, subject to the choice of observer later, is collected in an inaccessible *σ*-algebra P<sub>*t*</sub>, the past of *ξ*(*t*). The distribution of *ξ*(*t*), given the past P<sub>*t*</sub>, for each *t*, can then be represented as a stochastic process.

In the simplest case one can then imagine {*ξ*(*s*);*s* ≥ 0} as an inaccessible Markov process: The future is independent of the past, given the present. Under suitable regularity conditions, a continuous Markov process will be a diffusion process, i.e., a solution of a stochastic differential equation of the type

$$d\xi(t) = b(\xi(t), t)dt + \sigma(\xi(t), t)dw(t). \tag{8}$$

Here *b*(·, ·) and *σ*(·, ·) are continuous functions, also assumed differentiable, and {*w*(*t*); *t* ≥ 0} is a Wiener process. The Wiener process is a stochastic process with continuous paths, *w*(0) = 0, independent increments *w*(*t*) − *w*(*s*) and *E*((*w*(*t*) − *w*(*s*))<sup>2</sup>) = *t* − *s* for *s* ≤ *t*. Many properties of the Wiener process have been studied, including the fact that its paths are nowhere differentiable. The stochastic differential equation (8) must therefore be defined in a particular way; for an introduction to Itô calculus (stochastic calculus), see for instance [19].
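To make equation (8) concrete, the following is a minimal sketch of how one path of such a diffusion could be approximated numerically. It uses the Euler-Maruyama scheme, which is a standard discretization and not something taken from this chapter; the function name `euler_maruyama` and the particular drift and noise coefficients are illustrative assumptions only.

```python
import math
import random


def euler_maruyama(b, sigma, x0, t_end, n_steps, rng):
    """Approximate one path of d xi = b(xi, t) dt + sigma(xi, t) dw on [0, t_end]."""
    dt = t_end / n_steps
    times = [k * dt for k in range(n_steps + 1)]
    path = [x0]
    for k in range(n_steps):
        x, t = path[-1], times[k]
        # Wiener increment: independent of the past, mean 0, variance dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        path.append(x + b(x, t) * dt + sigma(x, t) * dw)
    return times, path


# Hypothetical coefficients chosen only for illustration:
# a linear restoring drift and a constant noise level.
theta, s = 1.0, 0.5
times, path = euler_maruyama(
    lambda x, t: -theta * x,   # drift b(x, t)
    lambda x, t: s,            # diffusion coefficient sigma(x, t)
    x0=1.0, t_end=5.0, n_steps=5000, rng=random.Random(0),
)
print(f"xi(0) = {path[0]:.3f}, xi({times[-1]:.1f}) = {path[-1]:.3f}")
```

With the noise switched off (`sigma` identically zero) the scheme reduces to the ordinary Euler method for the differential equation *dξ*/*dt* = *b*(*ξ*, *t*), which gives a simple sanity check.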

So far we have considered observers making predictions of the present value *ξ*(*t*), given the past P*<sub>t</sub>*. There is another type of epistemic process, which can be described as follows: Imagine an actor *A* who considers some future event for the particle, lying in a *σ*-algebra F*<sub>j</sub>*. He asks himself where he should best place the particle at time *t* in order to have this event fulfilled. In other words, he can adjust *ξ*(*t*) for this purpose. Again one can collect the *σ*-algebras for the different potential actors in one big inaccessible *σ*-algebra F*<sub>t</sub>*, the future after *t*. The conditioning of the present, given the future, defines {*ξ*(*t*); *t* ≥ 0} as a new inaccessible stochastic process, with *t* now running backwards in time. In the simplest case this is a Markov process, and can be described by a stochastic differential equation

$$d\xi(t) = b_*(\xi(t), t)\,dt + \sigma_*(\xi(t), t)\,dw_*(t), \tag{9}$$

where again *w*<sub>∗</sub>(*t*) is a Wiener process.

Without having much previous knowledge about modern stochastic analysis and without knowing anything about epistemic processes, Nelson [20] formulated his stochastic mechanics, which serves our purpose perfectly. Nelson considered the multidimensional case, but for simplicity, I will here only discuss a one-dimensional particle. Everything can be generalized.

Nelson discussed what corresponds to the stochastic differential equations (8) and (9) with *σ* and *σ*<sub>∗</sub> constant in space and time. Since heavy particles fluctuate less than light particles, he assumed that these quantities vary inversely with mass *m*, that is, *σ*<sup>2</sup> = *σ*<sub>∗</sub><sup>2</sup> = *ħ*/*m*. The constant *ħ* has the dimension of action, and turns out to be equal to Planck's constant divided by 2*π*. This assumes that *σ*<sup>2</sup> = *σ*<sub>∗</sub><sup>2</sup>, a fact that Nelson actually proved, in addition to proving that

$$b_* = b - \sigma^2 (\ln \rho)_x,$$

where *ρ* = *ρ*(*x*, *t*) is the probability density of *ξ*(*t*).
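The relation between the two drifts can be checked numerically in a simple special case. The sketch below is an illustration under stated assumptions, not part of Nelson's argument: it takes a stationary process with the hypothetical linear drift *b*(*x*) = −*θx* and constant *σ*, for which the stationary density is Gaussian with variance *σ*<sup>2</sup>/(2*θ*), so the formula above predicts *b*<sub>∗</sub>(*x*) = +*θx*. It further assumes that *b*<sub>∗</sub> is read as the mean backward velocity, i.e. the conditional mean of (*ξ*(*t*) − *ξ*(*t* − *h*))/*h* given *ξ*(*t*) for small *h*. The function name and parameter values are invented for the example.

```python
import math
import random


def forward_and_backward_slopes(theta, sigma, h, n, rng):
    """Estimate b(x)/x and b_*(x)/x for a stationary process with drift -theta*x.

    The process is sampled exactly on a grid of step h, and the mean forward and
    backward velocities are fitted by least squares through the origin.
    """
    v = sigma ** 2 / (2.0 * theta)          # stationary variance
    a = math.exp(-theta * h)                # exact one-step regression coefficient
    noise_sd = math.sqrt(v * (1.0 - a * a))
    x_prev = rng.gauss(0.0, math.sqrt(v))   # start in the stationary distribution
    fwd_num = fwd_den = bwd_num = bwd_den = 0.0
    for _ in range(n):
        x_next = a * x_prev + noise_sd * rng.gauss(0.0, 1.0)
        velocity = (x_next - x_prev) / h
        # Forward drift: regress the increment on the earlier position.
        fwd_num += velocity * x_prev
        fwd_den += x_prev * x_prev
        # Backward drift: regress the same increment on the later position.
        bwd_num += velocity * x_next
        bwd_den += x_next * x_next
        x_prev = x_next
    return fwd_num / fwd_den, bwd_num / bwd_den


theta, sigma = 1.0, 0.7
fwd, bwd = forward_and_backward_slopes(theta, sigma, h=0.01, n=200_000,
                                       rng=random.Random(0))
# Slope predicted by b_* = b - sigma^2 (ln rho)_x with (ln rho)_x = -2*theta*x/sigma^2.
predicted_bwd = -theta - sigma ** 2 * (-2.0 * theta / sigma ** 2)
print(f"forward slope  ~ {fwd:.3f} (drift b has slope {-theta})")
print(f"backward slope ~ {bwd:.3f} (formula predicts {predicted_bwd})")
```

The two estimates are subject to sampling error and to a small bias of order *θh*, so they should only be expected to agree with −*θ* and +*θ* approximately.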

Introduce *u* = (*b* − *b*<sub>∗</sub>)/2 and *v* = (*b* + *b*<sub>∗</sub>)/2. Then *R* = (1/2) ln *ρ*(*x*, *t*) satisfies *R<sub>x</sub>* = *mu*/*ħ*, where the subscript *x* denotes differentiation with respect to *x*. Let *S* be defined up to an additive constant by *S<sub>x</sub>* = *mv*/*ħ* and define the wave function of the particle by *f* = exp(*R* + *iS*). Then |*f*(*x*, *t*)|<sup>2</sup> = *ρ*(*x*, *t*) as it should. By defining the acceleration of the particle in a proper way and using Newton's second law, a set of partial differential equations for *u* and *v* can be found, and by choosing the additive constant in *S* properly, one deduces from these equations

$$i\hbar\frac{\partial}{\partial t}f(x,t) = \left[\frac{1}{2m}\left(-i\hbar\frac{\partial}{\partial x}\right)^2 + V(x)\right]f(x,t),\tag{10}$$

where *V*(*x*) is the potential energy. The details of these derivations can be found in [2], with more details in [20]. Identifying −*iħ* ∂/∂*x* as the operator for momentum, we see that (10) is the Schrödinger equation for the particle.
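The two elementary identities used on the way to (10) can be verified in a few lines. The following worked check uses only the relations stated above, namely *σ*<sup>2</sup> = *ħ*/*m*, the expression for *b*<sub>∗</sub> in terms of *b* and *ρ*, and the fact that *S* is real-valued; it does not reproduce the derivation of (10) itself, for which see [2] and [20].

$$
\begin{aligned}
u &= \tfrac{1}{2}(b - b_*) = \tfrac{1}{2}\sigma^2 (\ln\rho)_x = \frac{\hbar}{2m}(\ln\rho)_x,\\
R_x &= \tfrac{1}{2}(\ln\rho)_x = \frac{m u}{\hbar},\\
|f|^2 &= \left|e^{R + iS}\right|^2 = e^{2R} = e^{\ln\rho} = \rho .
\end{aligned}
$$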

#### **12. Conclusion**


Even though the mathematics here is more involved, the approach of the present chapter (expressed in more detail in [2]) should serve to take some of the mystery out of the ordinary formal introduction to quantum theory. A challenge for the future will be to develop the corresponding relativistic theory, by using representations of the Poincaré group together with an argument like that in Section 11. Also, one should seek a link to elementary particle physics using the relevant Lie group theory. Group theory is an important part of physics, and it should come as no surprise that it is also relevant to the foundation of quantum mechanics.

### **Author details**

Inge S. Helland

<sup>⋆</sup> Address all correspondence to: ingeh@math.uio.no

Department of Mathematics, University of Oslo, Blindern, Oslo, Norway

#### **13. References**


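[1] Ballentine, L. E. (1998). *Quantum Mechanics. A Modern Development.* Singapore: World Scientific.

[2] Helland, I. S. (2012). *A Unified Scientific Basis for Inference.* Submitted as a Springer Brief; arXiv:1206.5075.

[3] Birnbaum, A. (1962). On the foundations of statistical inference. *Journal of the American Statistical Association* 57, 269-326.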

**Chapter 16**

**Provisional chapter**

**Relational Quantum Mechanics**

A. Nicolaidis

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/54892

© 2012 Nicolaidis, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**1. Introduction**

Quantum mechanics (QM) stands out as the theory of the 20th century, shaping the most diverse phenomena, from subatomic physics to cosmology. All quantum predictions have been crowned with full success and utmost accuracy. Yet, the admiration we feel towards QM is mixed with surprise and uneasiness. QM defies common sense and common logic. Various paradoxes, including Schrödinger's cat and the EPR paradox, exemplify the lurking conflict. The reality of the problem is confirmed by Bell's inequalities and the GHZ equalities. We are thus led to revisit a number of old interlocked oppositions: operator – operand, discrete – continuous, finite – infinite, hardware – software, local – global, particular – universal, syntax – semantics, ontological – epistemological.

The logic of a physical theory reflects the structure of the propositions describing the physical system under study. The propositional logic of classical mechanics is Boolean logic, which is based on set theory. A set is devoid of any structure, being a plurality of structure-less individuals, qualified only by membership (or non-membership). Accordingly a set-theoretic enterprise is analytic, atomistic, arithmetic. It was noticed as early as 1936 by Birkhoff and von Neumann that the quantum real needs a non-Boolean logical structure. In numerous cases the need for a novel system of logical syntax is evident. Quantum measurement bypasses the old disjunctions subject-object, observer-observed. The observer affects the system under observation and the borderline between ontological and epistemological is blurred. Correlations are no longer local and a quantum system embodies multiple entanglements. The particular-universal dichotomy is also under revision. While a single quantum event is particular, a plethora of quantum events leads to universal patterns. Viewing the quantum system as a system encoding information, we understand that the usual distinction between hardware and software is not relevant. Most importantly, if we consider the opposing terms being-becoming, we realize that the emphasis is shifted to the becoming, the movement, the process. The underlying dynamics is governed by relational


