### abstract ###
In this paper, we consider the coherent theory of (epistemic) uncertainty of Walley, in which beliefs are represented through sets of probability distributions, and we focus on the problem of modeling prior ignorance about a categorical random variable
In this setting, it is a known result that a state of prior ignorance is not compatible with learning
To overcome this problem, another state of beliefs, called  near-ignorance , has been proposed
Near-ignorance resembles ignorance very closely, by satisfying some principles that can arguably be regarded as necessary in a state of ignorance, and allows learning to take place
This paper provides new and substantial evidence that near-ignorance, too, cannot really be regarded as a way out of the problem of starting statistical inference in conditions of very weak beliefs
The key to this result is focusing on a setting characterized by a variable of interest that is  latent
We argue that such a setting is by far the most common case in practice, and we provide, for the case of categorical latent variables (and general  manifest  variables), a condition that, if satisfied, prevents learning from taking place under prior near-ignorance
This condition is shown to be easily satisfied even in the most common statistical problems
We regard these results as strong evidence against the possibility of adopting a condition of prior near-ignorance in real statistical problems
### introduction ###
Epistemic theories of statistics are often confronted with the question of  prior ignorance
Prior ignorance means that a subject, who is about to perform a statistical analysis, is missing substantial beliefs about the underlying data-generating process
Yet, the subject would like to exploit the available sample to draw some statistical conclusion, i.e., the subject would like to use the data to learn, moving away from the initial condition of ignorance
This situation is very important as it is often desirable to start a statistical analysis with weak assumptions about the problem of interest, thus trying to implement an objective-minded approach to statistics
A fundamental question is whether prior ignorance is compatible with learning or not
Walley gives a negative answer for the case of his self-consistent (or  coherent ) theory of statistics based on the modeling of beliefs through sets of probability distributions
He shows, in a very general sense, that  vacuous  prior beliefs, i.e., beliefs that a priori are maximally imprecise, lead to vacuous posterior beliefs, irrespective of the type and amount of observed data  CITATION
At the same time, he proposes focusing on a slightly different state of beliefs, called  near-ignorance , that does enable learning to take place  CITATION
Loosely speaking, near-ignorant beliefs are beliefs that are vacuous for a proper subset of the functions of the random variables under consideration (see Section~)
In this way, a near-ignorance prior still gives one the possibility to express vacuous beliefs for some functions of interest, and at the same time it maintains the possibility to learn from data
The fact that learning is possible under prior near-ignorance is shown, for instance, in the special case of the  imprecise Dirichlet model  (IDM)  CITATION
This is a popular model, based on a near-ignorance set of priors, used in the case of inference from categorical data generated by a multinomial process
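As an illustration of how learning proceeds in the IDM, the sketch below computes the posterior lower and upper probabilities of each category from observed counts, using the standard IDM bounds n_i/(N+s) and (n_i+s)/(N+s). The function name `idm_bounds` and the default value of the hyperparameter `s` are assumptions made for this example only.

```python
def idm_bounds(counts, s=2.0):
    """Posterior lower/upper probability of each category under the IDM.

    counts: observed number of occurrences of each category.
    s: IDM hyperparameter (assumed default; it controls how fast the
       bounds tighten as data accumulate).
    """
    total = sum(counts)
    # Lower bound n_i / (N + s), upper bound (n_i + s) / (N + s).
    return [(n / (total + s), (n + s) / (total + s)) for n in counts]


# With no data the bounds are vacuous: every category gets [0, 1].
print(idm_bounds([0, 0, 0]))
# With data the intervals shrink around the observed frequencies.
print(idm_bounds([6, 3, 1]))
```

Before any observation each interval is [0, 1], which reflects the vacuous prior beliefs about the categories. After N observations each interval has width s/(N+s), so it narrows as the sample grows.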
Our aim in this paper is to investigate whether near-ignorance can really be regarded as a possible way out of the problem of starting statistical inference in conditions of very weak beliefs
We carry out this investigation in a setting made of categorical data generated by a multinomial process, as in the IDM, but we consider near-ignorance sets of priors in general, not only the one used in the IDM
The interest in this investigation is motivated by the fact that near-ignorance sets of priors appear to play a crucially important role in the question of modeling prior ignorance about a categorical random variable
The key point is that near-ignorance sets of priors can be made to satisfy two principles: the  symmetry  and the  embedding principles
The first is well known and is equivalent to Laplace's  indifference principle ; the second states, loosely speaking, that if we are ignorant a priori, our prior beliefs on an event of interest should not depend on the space of possibilities in which the event is embedded (see Section~ for a discussion about these two principles)
Walley  CITATION , and later de Cooman and Miranda  CITATION , have argued extensively on the necessity of both the symmetry and the embedding principles in order to characterize a condition of ignorance about a categorical random variable
This implies, if we agree that the symmetry and the embedding principles are necessary for ignorance, that near-ignorance sets of priors should be regarded as an especially important avenue for a subject who wishes to learn starting in a condition of ignorance
Our investigation starts by focusing on a setting where the categorical variable  SYMBOL  under consideration is  latent
This means that we cannot observe the realizations of  SYMBOL , so that we can learn about it only by means of another, not necessarily categorical, variable  SYMBOL , related to  SYMBOL  through a known conditional probability distribution  SYMBOL
Variable  SYMBOL  is assumed to be  manifest , in the sense that its realizations can be observed (see Section~)
The intuition behind the setup considered, made of  SYMBOL  and  SYMBOL , is that in many real cases it is not possible to directly observe the value of a random variable in which we are interested, for instance when this variable represents a patient's health and we are observing the result of a diagnostic test
In these cases, we need to use a manifest variable (the medical test) in order to obtain information about the original latent variable (the patient's health)
In this paper, we regard the passage from the latent to the manifest variable as made by a process that we call the  observational process
Using the introduced setup, we give a condition in Section~, related to the likelihood function, that is shown to be sufficient to prevent learning about  SYMBOL  under prior near-ignorance
The condition is very general as it is developed for any set of priors that models near-ignorance (thus including the case of the IDM), and for very general kinds of probabilistic relations between  SYMBOL  and  SYMBOL
We then show, by simple examples, that such a condition is easily satisfied, even in the most elementary and common statistical problems
In order to fully appreciate this result, it is important to realize that latent variables are ubiquitous in problems of uncertainty
The key point here is that the scope of observational processes greatly extends if we consider that even when we directly obtain the value of a variable of interest, what we actually obtain is the observation of the value rather than the value itself
Drawing this distinction makes sense because in practice an observational process is usually imperfect, i.e., there is very often (it could be argued that there is always) a positive probability of confounding the realized value of  SYMBOL  with another possible value, thus committing an observation error
Of course, if the probability of an observation error is very small and we consider one of the common Bayesian models proposed to learn under prior ignorance, then there is little difference between the results provided by a latent variable model that correctly models the observational process, and the results provided by a model where the observations are assumed to be perfect
For this reason, the observational process is often neglected in practice and the distinction between the latent variable and the manifest one is not enforced
But, on the other hand, if we consider sets of probability distributions to model our prior beliefs, instead of a single probability distribution, and in particular if we consider near-ignorance sets of priors, then there can be an extreme difference between a latent variable model and a model where the observations are considered to be perfect, so that learning may be impossible in the first model and possible in the second
As a consequence, when dealing with sets of probability distributions, neglecting the observational process may no longer be justified even if the probability of observation error is tiny
This is shown in a definite sense in Example~ of Section~, where we analyze the relevance of our results for the special case of the IDM
From the proofs in this paper, it follows that this kind of behavior is mainly determined by the presence, in the near-ignorance set of priors, of extreme, almost-deterministic, distributions
The problem is that these problematic distributions, which are usually not considered when dealing with Bayesian models with a single prior, cannot be ruled out without dropping near-ignorance
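The effect of almost-deterministic priors can be sketched numerically. The code below is not the paper's example; it is a minimal illustration under assumptions introduced here: a binary latent variable with chance theta of taking value 1, a manifest variable that reports the wrong value with a symmetric error probability `eps`, and Beta(`alpha`, `beta`) priors whose parameters sum to roughly s = 2. The function names are hypothetical. The posterior mean of theta is computed exactly by expanding the likelihood over the unknown number of latent ones.

```python
import math


def beta_fn(a, b):
    """Euler Beta function B(a, b), computed through log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def posterior_mean(n1, n0, alpha, beta, eps):
    """Posterior mean of theta = P(X = 1) under a Beta(alpha, beta) prior.

    n1, n0: numbers of manifest observations equal to 1 and to 0.
    eps: assumed probability that an observation reports the wrong value.
    """
    total = n1 + n0
    num = den = 0.0
    # Expand the likelihood over the number m = j + k of latent ones:
    # j ones observed correctly, k ones misreported as zeros.
    for j in range(n1 + 1):
        for k in range(n0 + 1):
            w = (math.comb(n1, j) * (1 - eps) ** j * eps ** (n1 - j)
                 * math.comb(n0, k) * eps ** k * (1 - eps) ** (n0 - k))
            m = j + k
            den += w * beta_fn(alpha + m, beta + total - m)
            num += w * beta_fn(alpha + m + 1, beta + total - m)
    return num / den


s = 2.0
tiny = 1e-12  # stands in for an almost-deterministic prior
for eps in (0.0, 0.1):
    lower = posterior_mean(8, 2, tiny, s, eps)  # prior mass piled near theta = 0
    upper = posterior_mean(8, 2, s, tiny, eps)  # prior mass piled near theta = 1
    print(f"eps={eps}: posterior mean ranges over about [{lower:.4g}, {upper:.4g}]")
```

With `eps = 0` the two extreme priors give posterior means close to 8/12 and 10/12, so the data narrow the range. With `eps = 0.1` the same priors give values close to 0 and 1, so the range stays essentially vacuous. The values of `tiny`, `eps`, and the counts are arbitrary choices for illustration.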
These considerations highlight the quite general applicability of the present results and hence raise serious doubts about the possibility of adopting a condition of prior near-ignorance in real, as opposed to idealized, applications of statistics
As a consequence, it may make sense to consider re-focusing the research about this subject on developing models of very weak states of belief that are, however, stronger than near-ignorance
This might also involve dropping the idea that both the symmetry and the embedding principles can be realistically met in practice
