Please note, this is the uncorrected text of the proofs.
Page numbers may not correspond to the printed
version.
Philosophy and Phenomenological Research
Vol. LXX, No. 2, March 2005
Is Semantic Information Meaningful Data?
LUCIANO FLORIDI
Wolfson College
There is no consensus yet on the definition of semantic information. This paper contrib-
utes to the current debate by criticising and revising the Standard Definition of semantic
Information (SDI) as meaningful data, in favour of the Dretske-Grice approach:
meaningful and well-formed data constitute semantic information only if they also qual-
ify as contingently truthful. After a brief introduction, SDI is criticised for providing
necessary but insufficient conditions for the definition of semantic information. SDI is
incorrect because truth-values do not supervene on semantic information, and misin-
formation (that is, false semantic information) is not a type of semantic information, but
pseudo-information, that is, not semantic information at all. This is shown by arguing that
none of the reasons for interpreting misinformation as a type of semantic information is
convincing, whilst there are compelling reasons to treat it as pseudo-information. As a
consequence, SDI is revised to include a necessary truth-condition. The last section
summarises the main results of the paper and indicates some interesting areas of appli-
cation of the revised definition.
1. Introduction
The concept of information has become central in most contemporary phi-
losophy. However, recent surveys have shown no consensus on a single, uni-
fied definition of semantic information.1 This is hardly surprising. Informa-
tion is such a powerful and elusive concept that, as an explicandum, it can be
associated with several explanations, depending on the cluster of requirements
and desiderata that orientate a theory.2 Claude Shannon, for example,
remarked that
The word “information” has been given different meanings by various writers in the general
field of information theory. It is likely that at least a number of these will prove sufficiently
useful in certain applications to deserve further study and permanent recognition. It is hardly to
be expected that a single concept of information would satisfactorily account for the numerous
possible applications of this general field.
From “The Lattice Theory of Information”, in Shannon (1993), p. 180.
1 For a review of the literature and further information see Floridi [2002], [2003a],
[2003b].
2 The point is made explicit and defended in Bar-Hillel and Carnap [1953], Szaniawski
[1984] and Floridi [2002].
Polysemantic concepts such as information can be fruitfully analysed only in
relation to well-specified contexts of application. Following this localist
principle, only one crucial aspect of a specific type of information will be
analysed in this paper, namely the alethic nature of declarative, objective
and semantic (DOS) information (more on these qualifications in the next
section). The question addressed is whether alethic values are supervenient3 on
DOS information, as presumed by the standard definition of information
(SDI). The negative answer defended is that DOS information encapsulates
“truthfulness”, so that “true information” is simply redundant and “false
information”, i.e. misinformation, is merely pseudo-information. It follows
that SDI needs to be revised by adding a necessary truth-condition. Five areas
of application of the revised definition are briefly discussed in the last section.
2. The Standard Definition of Information
Intuitively, “information” is often used to refer to non-mental, user-independ-
ent, declarative (i.e. alethically qualifiable),4 semantic contents, embedded in
physical implementations like databases, encyclopaedias, web sites, televi-
sion programmes and so on, which can variously be produced, collected,
accessed and processed. The Cambridge Dictionary of Philosophy, for exam-
ple, defines information thus:
an objective (mind independent) entity. It can be generated or carried by messages (words,
sentences) or by other products of cognizers (interpreters). Information can be encoded and
transmitted, but the information would exist independently of its encoding or transmission.
The extensionalist analysis of this popular concept of DOS (declarative,
objective and semantic) information is not immediately connected to levels
of subjective uncertainty and ignorance, to probability distributions, to
utility-functions for decision-making processes, or to the analysis of
communication processes. So the corresponding mathematical and pragmatic5 senses
in which one may speak of information are not relevant in this context and can
be disregarded.
3 This technical term is used here to mean, weakly, “coming upon something subsequently,
as an extraneous addition”. The term is not used with the stronger meaning according to
which “if a set of properties x supervenes on another set of properties y, this means that
there is no variation with respect to x without a variation with respect to y”. I am grateful
to Philipp Keller for having prompted me to add this clarification.
4 There are many plausible contexts in which a stipulation (“let the value of x = 3” or
“suppose we discover the bones of a unicorn”), an invitation (“you are cordially invited
to the college party”), an order (“close the window!”), an instruction (“to open the box
turn the key”), a game move (“1.e2-e4 c7-c5” at the beginning of a chess game) may be
correctly qualified as kinds of information. These and other similar, non-declarative
meanings of “information” (e.g. to refer to a music file or to a digital painting) are not
discussed in this paper, where objective semantic information is taken to have a
declarative or factual value, i.e., it is supposed to be correctly qualifiable alethically.
5 See Bar-Hillel and Carnap [1953]. A pragmatic theory of information addresses the
question of how much information a certain message carries for a subject S in a given
doxastic state and within a specific informational environment.
Over the last three decades, most analyses have supported a definition of
DOS information in terms of data + meaning. Three quotations from a vari-
ety of influential texts well illustrate the popularity of the bipartite account:6
Information is data that has been processed into a form that is meaningful to the recipient.
Davis and Olson (1985), 200.
Data is the raw material that is processed and refined to generate information. Silver and Sil-
ver (1989), 6.
Information equals data plus meaning. Checkland and Scholes (1990), 303.
The bipartite account has gained sufficient consensus to become an opera-
tional standard in fields that tend to deal with data and information as reified
entities (consider, for example, the now common expression “data mining”),
especially Information Science; Information Systems Theory, Methodology,
Analysis and Design; Information (Systems) Management; Database Design;
and Decision Theory. More recently, the bipartite account has begun to influ-
ence the philosophy of computing and information as well (see for example
Chalmers [1996], Floridi [1999], Franklin [1995] and Mingers [1997]).
The practical utility of the bipartite account is indubitable. The question is
whether it is rigorous enough to be applied in the context of an information-
theoretical epistemology. We shall see that this is not the case, but before
offering any criticism, we need a more rigorous formulation.
2.1. An Analysis of the Standard Definition of Information
Situation logic (Israel and Perry [1990]; Devlin [1991]) provides a powerful
methodology for our task. Let us use the symbol σ and the term “infon” to
refer to discrete items of information, irrespective of their semiotic code and
physical implementation:
SDI) σ is an instance of DOS information if and only if:
SDI.1) σ consists of n data (d), for n ≥ 1;
SDI.2) the data are well-formed (wfd);
SDI.3) the wfd are meaningful (mwfd = δ).
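The three clauses can be read as a simple membership test. The following Python sketch is offered only as an illustration, not as part of the definition: the class `Datum`, its two Boolean flags and the function `satisfies_sdi` are hypothetical names introduced here, and well-formedness and meaningfulness are assumed to be given rather than computed.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Datum:
    """One datum d. Both flags are hypothetical stand-ins for SDI.2 and SDI.3."""
    content: str
    well_formed: bool  # SDI.2: d respects the syntax of the chosen code
    meaningful: bool   # SDI.3: d has a semantics in that code


def satisfies_sdi(sigma: Sequence[Datum]) -> bool:
    """Return True when sigma meets SDI.1-SDI.3 and nothing more."""
    has_data = len(sigma) >= 1                          # SDI.1: n >= 1
    all_well_formed = all(d.well_formed for d in sigma)  # SDI.2
    all_meaningful = all(d.meaningful for d in sigma)    # SDI.3
    return has_data and all_well_formed and all_meaningful
```

On this reading an empty σ never qualifies, whereas a single well-formed and meaningful datum does. Nothing in the test refers to whether σ is true.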
Three comments are now in order.
6 Many other sources endorse equivalent accounts as uncontroversial; see Floridi [2003a]
for references.
First, SDI.1 indicates that information cannot be dataless, but it does not
specify which types of δ constitute information. Data can be of four types
(Floridi [1999]):
δ.1) primary data. These are what we ordinarily mean by, and perceive as,
the principal data stored in a database, e.g. a simple array of numbers, or the
contents of books in a library. They are the data an information management
system is generally designed to convey to the user in the first place;
δ.2) metadata. These are secondary indications about the nature of the pri-
mary data. They enable a database management system to fulfil its tasks by
describing essential properties of the primary data, e.g. location, format,
updating, availability, copyright restrictions, etc.;
δ.3) operational data. These are data regarding usage of the data themselves,
the operations of the whole data system and the system’s performance;
δ.4) derivative data. These are data that can be extracted from δ.1-δ.3, when-
ever the latter are used as sources in search of patterns, clues or inferential
evidence, e.g. for comparative and quantitative analyses (ideometry).
At first sight, the typological neutrality (TN) implicit in SDI.1 may
seem counterintuitive. A database query that returns no answer, for example,
still provides some information, if only negative information; and silence is a
meaningful act of communication, if minimalist, Grice docet, yet where are
the data in these cases? SDI.1 and TN cannot be justified by arguing that
absence of data is usually uninteresting, because similar pragmatic
considerations are at least controversial, as shown by the previous two examples,
and in any case irrelevant, since in this context the analysis concerns only
DOS information, not interested information.7 Rather, SDI.1 and TN are
justified by the following principle of data-types reduction (PDTR):
PDTR) σ consists of a non-empty set (D) of data δ; if D seems empty and σ
still seems to qualify as information, then
1. the absence of δ is only apparent because of the occurrence of some nega-
tive primary δ, so that D is not really empty; or
2. the qualification of σ as information consisting of an empty D is
misleading, since what really qualifies as information is not σ itself but some
non-primary information µ concerning σ, constituted by meaningful
non-primary data δ.2-δ.4 about σ.
7 Interested information is a technical expression. The pragmatic theory of interested
information is crucial in Decision Theory, where a standard quantitative axiom states
that, in an ideal context and ceteris paribus, the more informative σ is to S, the more S
ought to be rationally willing to pay to find out whether σ is true (Sneed [1967]).
So in either case there is information because there is some type of data.
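PDTR amounts to a small case analysis, which may be sketched as follows. The sketch is only illustrative: the function `classify_by_pdtr`, the enumeration `Verdict` and the representation of data as optional strings are assumptions made here, not part of the principle.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    PRIMARY = "primary information (PDTR.1)"
    NON_PRIMARY = "non-primary information (PDTR.2)"
    NONE = "no information"


def classify_by_pdtr(primary: Optional[str], non_primary: Optional[str]) -> Verdict:
    """Classify an apparently dataless sigma.

    primary: positive or negative primary data, or None if there are none.
    non_primary: metadata, operational or derivative data about sigma, or None.
    """
    if primary is not None:
        # PDTR.1: D only seemed empty; explicit negative data are still primary data.
        return Verdict.PRIMARY
    if non_primary is not None:
        # PDTR.2: what qualifies as information is mu, built from non-primary data.
        return Verdict.NON_PRIMARY
    # No data of any type, hence no information.
    return Verdict.NONE
```

An explicit negative reply counts as primary data, a diagnostic about the source counts as non-primary data, and only the total absence of data of every type yields no information.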
Consider the two examples above. Suppose we are using the Routledge
Encyclopedia of Philosophy on CD-ROM (EREP). If the database provides
an answer, it will provide at least a negative answer, e.g. the EREP will
open a small window with the message “no search hits found”, so PDTR.1
applies: primary information is provided through explicit negative data. If the
database provides no answer, either it fails to provide any data at all (e.g. the
screen of the EREP remains unmodified), in which case no primary informa-
tion σ is available or, more likely, there is a way of monitoring or inferring
the problems encountered by the database to establish, for example, that the
EREP is not responding rather than being busy elaborating (one can try to
use the CTRL + ALT + DEL command, which will open a window with
information about the performance of the programs currently open), in which
case PDTR.2 applies: there isn’t any primary information and the non-pri-
mary information gained is provided by some metadata. Take now the second
example. My wife’s silence could provide some primary information, e.g. a
tacit assent or denial. The datum is the silence itself, as long as it counts as a
difference (more on this in a moment). Or her silence could carry some non-
primary information µ, e.g. she has not heard my question. The fact that I do
not even know whether her silence provides some primary or non-primary
information (let alone being able to guess the specific meaning of her silence)
explains why, in any binary communication, we tend to adopt a “positive”
signal for a negative message: the computer will send me a 1 or a 0 (primary
negative information) or nothing at all (secondary negative information),
rather than a signal or nothing at all, although the latter code would still be
sufficient to communicate in ideal circumstances. This point can be further
clarified by a third example. Imagine a very boring device that can produce
only one symbol, like E. A. Poe’s raven, who can answer only “nevermore.”
This is called a unary device. The raven is the informer, we are the informee,
“nevermore” is the message, there is a coding and decoding procedure through
a language, a channel of communication, and perhaps some possible noise.
Informer and informee share the same background knowledge about the collec-
tion of usable symbols (the alphabet). Given this a priori knowledge, it is
obvious that a unary device produces zero amount of primary information.
Simplifying, we already know the outcome so our ignorance cannot be
decreased. Whatever the informational state of the system, asking appropriate
questions to the raven does not make any difference. The point that interests
us here is that a unary source like the raven answers every question all the
time with only one symbol, not with silence or symbol, since silence, if
possible, would count as a signal, i.e. as a 0. On the contrary, my wife’s
silence provides primary information if it is like a tacit 0, that is, only if I
assume that she might have answered something else instead. A completely
silent source is equivalent to my wife not hearing the question and qualifies
as a unary source, which can provide only non-primary information. This
shows that although there is no dataless information, the presence of data,
e.g. “nevermore”, does not guarantee the presence of primary information.
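The claim that a unary device produces a zero amount of primary information can be checked numerically. The sketch below assumes Shannon's entropy as the quantitative measure, which is a standard choice but is not named in the text at this point; the function name `shannon_entropy` and the probability tables are illustrative.

```python
import math
from typing import Mapping


def shannon_entropy(probabilities: Mapping[str, float]) -> float:
    """Average information of a source, in bits per symbol.

    probabilities maps each symbol of the alphabet to its probability.
    """
    total = sum(probabilities.values())
    if not math.isclose(total, 1.0):
        raise ValueError("symbol probabilities must sum to 1")
    # Symbols with probability 0 contribute nothing and are skipped.
    return sum(-p * math.log2(p) for p in probabilities.values() if p > 0)


# The raven: one symbol, emitted with certainty (assumed alphabet).
raven = {"nevermore": 1.0}

# A binary source in which 0 and 1 are equally likely (assumed distribution).
binary = {"0": 0.5, "1": 0.5}
```

Under these assumptions `shannon_entropy(raven)` is zero, since the outcome is known in advance, whereas `shannon_entropy(binary)` is one bit per symbol. This is why silence can carry primary information only when it stands in for one of at least two possible signals.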
To summarise, when apparent absence of δ is not reducible to the occur-
rence of negative primary δ (the equivalent of a zero), either there is no
information or what becomes available and qualifies as information is some
further non-primary information µ about σ, constituted by some non-primary
δ.2-δ.4. Now, differences in the reduction both of the absence of positive
primary δ to the presence of negative primary δ and of σ to µ (when D is
truly empty) warrant that there can be more than one σ that may (mislead-
ingly) appear to qualify as information and be equivalent to an apparently
empty D. Not all silences are the same. However, since SDI.1 defines infor-
mation in terms of δ, without any further restriction on the typological
nature of the latter, it is sufficiently general to capture primary (positive or
negative) δ.1 and non-primary data δ.2-δ.4 as well, and hence the correspond-
ing special classes of information just introduced. As far as SDI.1 is con-
cerned, SDI is correct: there can be no dataless information.
Second comment. According to SDI.1, σ can consist of only a single
datum. Information is usually conveyed by large clusters or patterns of well-
formed, codified data, often alphanumeric, which are heavily constrained syn-
tactically and already very rich semantically. However, in its simplest form a
datum can be reduced to just a lack of uniformity, that is, a difference between
the presence and the absence of e.g. silence or of a signal:
Dd) d = (x ≠ y)
The dependence of information on the occurrence of syntactically well-formed
clusters, strings or patterns of data, and of data on the occurrence of physi-
cally implementable differences, explains why information can be decoupled
from one type of physical support in favour of another. Interpretations of this
support-independence can vary quite radically, however, because Dd leaves
underdetermined not only the logical type to which the relata belong (see
TN), but also the classification of the relata (taxonomic neutrality) and the
kind of support that the implementation of their inequality may require (onto-
logical neutrality).
Consider the taxonomic neutrality (TaxN) first. A datum is usually classi-
fied as the entity exhibiting the anomaly, often because the latter is perceptu-
ally more conspicuous or less redundant than the background conditions.
However, the relation of inequality is binary and symmetric. A white sheet of
paper is not just the necessary background condition for the occurrence of a
black dot as a datum, it is a constitutive part of the datum itself, together
with the fundamental relation of inequality that couples it with the dot.
Nothing is a datum per se, for being a datum is an external property. So SDI
endorses the following thesis:
TaxN) a datum is a relational entity.
Understood as relational entities, data are definable as constraining affor-
dances, exploitable by a system as input of adequate queries that correctly
semanticise them to produce information as output. In short, semantic infor-
mation can also be described erotetically as data + queries (Floridi [1999]).
Consider next the ontological neutrality (ON). By rejecting the
possibility of dataless information, SDI endorses the following modest thesis:
ON) no information without data representation.
ON is often interpreted materialistically, as advocating the impossibility of
physically disembodied information, through the equation “representation =
physical implementation”, thus:
S.1) no information without physical implementation.
S.1 is an inevitable assumption when working on the physics of computa-
tion, since computer science must necessarily take into account the physical
properties and limits of the carriers of information.8 It is also the ontological
assumption behind the Physical Symbol System Hypothesis in AI and Cog-
nitive Science (Newell and Simon [1976]). However, ON does not specify
whether, ultimately, the occurrence of every discrete state necessarily requires
a material implementation of the data representations. Arguably, environ-
ments in which all entities, properties and processes are ultimately noetic
(e.g. Berkeley, Spinoza), or in which the material or extended universe has a
noetic or non-extended matrix as its ontological foundation (e.g. Pythagoras,
Plato, Leibniz, Fichte, Hegel), seem perfectly capable of upholding ON
without embracing S.1. The relata in Dd could be monads, for example.
Indeed, the classic realism vs. antirealism debate can be reconstructed pre-
cisely in terms of reasonably acceptable interpretations of ON.
All this explains why SDI is also consistent with two other popular slo-
gans, this time favourable to the proto-physical nature of information and
hence completely antithetic to S.1:
S.2) “It from bit. Otherwise put, every “it” -- every particle, every field of
force, even the space-time continuum itself -- derives its function, its
meaning, its very existence entirely -- even if in some contexts indirectly -- from
the apparatus-elicited answers to yes-or-no questions, binary choices, bits. “It
from bit” symbolizes the idea that every item of the physical world has at
bottom -- a very deep bottom, in most instances -- an immaterial source and
explanation; that which we call reality arises in the last analysis from the
posing of yes-no questions and the registering of equipment-evoked
responses; in short, that all things physical are information-theoretic in
origin and that this is a participatory universe.” Wheeler (1990), 5.
and
S.3) “[information is] a name for the content of what is exchanged with the
outer world as we adjust to it, and make our adjustment felt upon it.” Wiener
(1954), 17. “Information is information, not matter or energy. No
materialism which does not admit this can survive at the present day” Wiener (1961),
132.
8 Landauer [1996]. The debate on S.1 has flourished especially in the context of quantum
computing.
S.2 endorses an information-theoretic, metaphysical monism: the universe’s
essential nature is digital, being fundamentally composed of information as
data instead of matter or energy, with material objects as a complex secondary
manifestation. S.2 may, but does not have to endorse a computational view
of information processes. S.3 advocates a more pluralistic approach along
similar lines. Both are compatible with SDI.
The third and final comment concerns SDI.3 and can be introduced by dis-
cussing a fourth slogan:
S.4) “In fact, what we mean by information—the elementary unit of informa-
tion—is a difference which makes a difference”. Bateson (1973), 428.
S.4 is one of the earliest and most popular formulations of SDI (see for
example Franklin [1995], 34 and Chalmers [1996], 281). A “difference” is
just a discrete state, i.e. a datum, and “making a difference” simply means
that the datum is “meaningful”, at least potentially. How data can come to
have an assigned meaning and function in a semiotic system in the first place
is one of the hardest problems in semantics. Luckily, the semanticisation of
data need not detain us here because SDI.3 only requires the δ to be provided
with a semantics already. The point in question is not how but whether data
constituting semantic information can be correctly described as being mean-
ingful independently of an informee. The genetic neutrality (GN) supported
by SDI states that:
GN) δ can have a semantics independently of any informee.
Before the discovery of the Rosetta Stone, Egyptian hieroglyphics were
already regarded as information, even if their semantics was beyond the com-
prehension of any interpreter. The discovery of an interface between Greek and
Egyptian did not affect the hieroglyphics’ embedded semantics but its accessi-
bility. This is the weak, conditional-counterfactual sense in which SDI.3 can
speak of meaningful data being embedded in an information-carrier informee-
independently. GN supports the possibility of information without an
informed subject, to adapt Popper’s phrase. Meaning is not (at least not
only) in the mind of the user. GN is to be distinguished from the stronger,
realist thesis, supported for example by Dretske (1981), according to which
data could also have their own semantics independently of an intelligent pro-
ducer/informer. This is also known as environmental information, and a
typical example is supposed to be provided by the concentric rings visible in
the wood of a cut tree trunk, which may be used to estimate the age of the
plant.
To summarise, insofar as SDI provides necessary conditions for σ to
qualify as DOS information, it also endorses four types of neutrality: TN,
TaxN, ON and GN. These features represent an obvious advantage, as they
make SDI perfectly scalable to more complex cases, and hence reasonably
flexible in terms of applicability. However, by specifying that SDI.1-SDI.3
are also sufficient conditions, SDI further endorses a fifth type of alethic
neutrality (AN) which turns out to be problematic. Let us see why.
3. Alethic neutrality
According to SDI, alethic values are not embedded in, but supervene on
semantic information:
AN) meaningful and well-formed data qualify as information, no matter
whether they represent or convey a truth or a falsehood or have no alethic
value at all.
It follows that
FI) false information (including contradictions), i.e. misinformation, is a
genuine type of DOS information, not pseudo-information;
TA) tautologies qualify as information; and
TI) “it is true that σ” where σ is a variable that can be replaced by any
instance of genuine DOS information, is not a redundant expression; for
example, “it is true” in the conjunction “‘the earth is round’ qualifies as
information and it is true” cannot be eliminated without semantic loss.9
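The contrast at stake can be made concrete with a small sketch. It is illustrative only: `Content`, `qualifies_under_an` and `qualifies_with_truth_condition` are hypothetical names, the truth value is assumed to be given, and the second test models only the necessary truth-condition that a revision of SDI would add, not the further requirement of contingency.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Content:
    """A candidate sigma; all fields are assumed to be supplied, not computed."""
    text: str
    well_formed: bool
    meaningful: bool
    truth_value: Optional[bool]  # None stands for "no alethic value at all"


def qualifies_under_an(c: Content) -> bool:
    # AN: the truth value is never consulted.
    return c.well_formed and c.meaningful


def qualifies_with_truth_condition(c: Content) -> bool:
    # Same conditions plus a necessary truth-condition.
    return qualifies_under_an(c) and c.truth_value is True


true_claim = Content("the earth is round", True, True, True)
false_claim = Content("the earth is flat", True, True, False)
```

Under AN both `true_claim` and `false_claim` qualify, which is exactly what FI states. Once the truth-condition is added, only `true_claim` does, and the false claim is left as pseudo-information.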
None of these consequences seems ultimately defensible, and their rejection
forces a revision of AN and hence of SDI. For the sake of simplicity, in the
rest of this article only the rejection of FI will be pursued, following two
strategies. The first consists in showing that none of the main reasons that
could be adduced for interpreting false information as a type of information is
convincing. This strategy is pursued in section four. The second strategy con-
sists in showing that there are compelling reasons to treat false and tautologi-
cal information as pseudo-information. This is argued in section five. Regard-
ing TA, this is commonly assumed to be false in the philosophical literature
on semantic information (see Floridi [2003c]), but it is also crucially connected
to the interpretation of mathematical and analytic truths, so a satisfactory
discussion of its negation cannot be pursued here but must be left to another
paper. Further arguments against AN could also be formulated on the basis of
the literature on deflationary theories of truth and hence a criticism of TI.
These arguments are not going to be rehearsed here because the development
of this strategy, which has interesting consequences for the deflationary theo-
ries themselves, deserves an independent analysis that lies beyond the scope
of this paper. I shall return to the issue in the conclusion, but only to clarify
what may be expected from this line of reasoning.
4. Nine bad reasons to think that false information is a type of semantic information
Linguistically, the expression “false information” is common and perfectly
acceptable. What is meant by it is often less clear, though. The American
legislation on food disparagement provides an enlightening example.
Food disparagement is legally defined in the US as the wilful or malicious
dissemination to the public, in any manner, of false information that a per-
ishable food product or commodity is not safe for human consumption.
“False information” is then defined, rather vaguely, as
“information not based on reasonable and reliable scientific inquiry, facts, or data” (Ohio
legislation, http://www.ohiocitizen.org/campaigns/pesticides/veglibel.html);
“information that is not based on verifiable fact or on reliable scientific data or evidence”
(Vermont legislation, http://www.leg.state.vt.us/docs/2000/bills/intro/h-190.htm);
9 Note that the conjunction of FI and TI presupposes two theses that are usually
uncontroversial: (i) that information is strictly connected with, and can be discussed in terms of,
alethic concepts; and (ii) that any theory of truth should treat alethic values or concepts
symmetrically.