A Grammar Inclusion Hypothesis of child language variation
Marina Tzakosta and Anthi Revithiadou
University of Crete and University of the Aegean
This paper examines variation in language development based on production data from three Greek-
speaking children. Variation suggests that children employ more than one grammar during the acquisition
process. This naturally raises the question of how ‘unwanted’ grammars gradually give way to the one that
relates to the adult/target grammar. To account for variation, we apply partial ordering (Anttila
1997a, b) to Tzakosta’s (2004) Multiple Parallel Grammars model of language development. More
specifically, we propose that in the intermediate state of acquisition constraint permutation of the initial
MARKEDNESS » FAITHFULNESS ranking leads to grammar explosion. We view the resulting grammars as
partial orders that contain sets of totally ranked grammars (subgrammars). The pivotal claim is that only
those subgrammars that are typologically closer to the target one will eventually survive. This is stated as
the Grammar Inclusion Hypothesis. The theoretical gain of the proposed model is that it provides a
principled basis to define developmental paths and also to distinguish between smart and non-smart paths.
The latter are partial orders that do not contain the target grammar as a total order and hence are doomed to
extinction. The former, on the other hand, are partial orders that contain at least one total order that relates
to the target grammar and, crucially, connect the running state of acquisition with the end state of language
development. Our hypothesis finds empirical support in both inter-child and intra-child language variation.
Keywords: language acquisition, partial ordering, multiple parallel grammars, smart and non-smart paths
Research on language acquisition was originally restricted to the description of child data, the discussion of the
emergent production patterns and the formulation of linguistic generalizations in the form of phonological rules
(Smith 1973, Stampe 1973). Only in the early 1990s was there a shift of attention to the process of acquisition per
se and, specifically, to how exactly children reach the final state of the ambient language given the impoverished
status of their initial grammar (Demuth and Fee 1995, Fikkert 1994, Vihman 1996). In particular, most research
questions aimed at providing an explanation regarding the order of acquisition and, especially, the particular
paths young language learners take during the acquisition process.
Studies on language development are primarily couched within the Principles and Parameters (PP)
framework (Chomsky 1981) and Optimality Theory (OT, Prince and Smolensky 1993, McCarthy and Prince
1993a, b). According to Chomsky (1981), Universal Grammar (UG) consists of a finite set of parameters, not
yet set on specific values. Depending on the system being acquired, each child has to appropriately adjust these
parameters. More precisely, s/he has to determine the parameter setting of the target language and tune the
parameters accordingly. The question posed by Dresher (1999a, b) is how the learner knows which parts of the
input are relevant to each parameter, given that the latter are by definition abstract structures. Dresher’s (1999a,
b) and Dresher and Kaye’s (1990) answer to this question is that the learner needs to associate each parameter
with specific triggers or cues. That is, the learner must always be in search of positive evidence (see also Lasnik
and Uriagereka 2002, Legate and Yang 2002, Pulleyblank and Turkel 1998, Sampson 2002). For instance, in a
language where CVC syllables are permitted, the learner needs to be exposed to such structures in order to set
the syllable coda parameter (i.e. syllables must have codas) to the value ‘on’. If exposure to such structures does
not take place, it is likely that the learner will acquire the wrong grammar, i.e. one that lacks CVC syllables.
Such a grammar potentially corresponds to an actual grammar of some language but, crucially, not to the
grammar of the specific language being acquired. In other words, parameters must be intrinsically ordered as the
learner progressively moves from less complex structures to more complex ones. Fikkert (1994) argued that this
is exactly the case in the acquisition of syllable structure and stress in Dutch. On the basis of production data
from twelve children, she shows that, at the beginning, children set some parameters so that they produce simple
words, in terms of the number and the complexity of syllables. As the acquisition process proceeds, however,
they gradually re-set the parameters towards producing more complex syllable structures. Language
development, therefore, is considered to be a gradual parameter (re-)setting process. It should be emphasized
that in the PP framework the child acquires one grammar, which, at each stage of language acquisition, becomes
more elaborate – via parameter re-setting – and thus typologically closer to the adult grammar.
Tesar and Smolensky (1993, 1998a, b, 2000) advance a similar idea which is, however, cast within the
constraint-based framework of OT. More specifically, they propose a learning algorithm which is based on
constraint demotion.1 To explain, a grammar is considered to be a set of universal and violable constraints that
are ranked with respect to each other. Languages differ in how they prioritize and appropriately order these
constraints. Consequently, acquiring a language is tantamount to figuring out the particular ranking of the target
language.2 Tesar and Smolensky provide solid argumentation that children start from the simplest possible
grammar, namely one where all markedness constraints outrank faithfulness constraints. In the course of
acquisition, they refine and improve this grammar by gradually demoting markedness constraints, as dictated by
the data they are exposed to. The process of demotion is completed only when they reach a certain stage where
the constraint ordering of their grammar is identical to the one of the adult grammar.
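A single demotion step of the kind described above can be sketched as follows. This is a simplified illustration and not Tesar and Smolensky's full algorithm: the function name `demote`, the numbering of strata, and the three constraints with their violation marks (a coda-final adult form competing with a coda-less form) are assumptions made for the example.

```python
# Hypothetical sketch of one constraint-demotion step.
# Strata are numbered from 0 (highest); constraints sharing a number are unranked.

def demote(strata, winner_marks, loser_marks):
    """Return updated strata after processing one winner/loser pair.

    strata:       dict mapping constraint -> stratum index (0 = top)
    winner_marks: violations incurred by the adult (winning) form
    loser_marks:  violations incurred by a competing (losing) form
    """
    prefers_winner = [c for c in strata
                      if loser_marks.get(c, 0) > winner_marks.get(c, 0)]
    prefers_loser = [c for c in strata
                     if winner_marks.get(c, 0) > loser_marks.get(c, 0)]
    if not prefers_winner:
        raise ValueError("no constraint prefers the winner")
    # Highest-ranked constraint that favors the adult form.
    pivot = min(strata[c] for c in prefers_winner)
    updated = dict(strata)
    for c in prefers_loser:
        # Demote only constraints not already dominated by the pivot stratum.
        if updated[c] <= pivot:
            updated[c] = pivot + 1
    return updated

# Assumed initial state: markedness (NOCODA) above faithfulness (MAX-SEG, DEP-SEG).
initial = {"NOCODA": 0, "MAX-SEG": 1, "DEP-SEG": 1}
after = demote(initial,
               winner_marks={"NOCODA": 1},   # adult form keeps its coda
               loser_marks={"MAX-SEG": 1})   # competitor deletes the coda
print(after)  # {'NOCODA': 2, 'MAX-SEG': 1, 'DEP-SEG': 1}
```

In this toy case, exposure to a coda-final adult form moves the markedness constraint below the faithfulness constraints, which is the direction of change the paragraph describes.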
The above proposals as well as other models of language development (Smith 1973, Drachman 1973a, b)
assume a linear model of acquisition in the sense that the child’s language system progresses step-by-step and in
a homogeneous fashion without regressions to earlier developmental states and without showing any variation in
production. In fact, such proposals do not leave much leeway to variation outside transitional stages, i.e., turning
points for parameter re-setting. However, Revithiadou and Tzakosta (2004a, b) and Tzakosta (2004), based on
longitudinal production data from Greek, observed extensive inter- and intra-child variation which is, crucially,
not restricted to transitional developmental stages only, but rather spreads throughout the language acquisition
process (cf. Drachman 1975, for relevant discussion). On the basis of these findings, they propose that each
child makes use of a set of multiple parallel grammars (MPG), which are responsible for the emergence of variable
outputs for a given input string during the same developmental phase.
According to the MPG, elaborately developed in Tzakosta (2004), learning is completed in three
developmental phases: the initial phase, the intermediate phase and the final phase. The shape of the grammar in
polar phases is quite firm. More specifically, in the initial state, markedness constraints outrank faithfulness
constraints and, consequently, unmarked structures prevail in child speech. In the final state, on the other hand,
faithfulness constraints occupy a high rank in the constraint hierarchy, yielding outputs which are typologically closer
to the target forms. Multiple grammars emerge during the intermediate phase as a result of constraint
permutation of the initial MARKEDNESS » FAITHFULNESS grammar. Such grammars are activated in parallel but
become weaker and are eventually abandoned as acquisition proceeds and children reach the target grammar.
Significantly, the grammars that are typologically more distant from the target grammar are abandoned first,
whereas grammars with a stronger typological connection to the target grammar are more resistant to
Extending this non-linear model of acquisition, Kateri, Revithiadou and Varlokosta (2005) show that the
grammars each child employs at a certain developmental phase are interconnected in the sense that one may
entail another. Because only a few and not all of these grammars are reinforced by positive evidence, the child
maintains a web of grammars that progressively will lead her/him to the adult grammar. The main goal of this
paper is to explore the nature of such developmental paths3 and, moreover, to explain in a principled way how
the child is driven from a set of multiple grammars to the target one. For this purpose, we propose a restricted
version of the MPG that relies heavily on the Grammar Inclusion Hypothesis (‘grammars that include other
grammars’) first proposed by Kateri, Revithiadou and Varlokosta (2005) and further elaborated here. More
specifically, we exploit the inherent property of Optimality-Theoretic grammars, namely partial ordering,
proposed by Anttila (1997a, b et seq), and put forward the claim that the set of possible grammars employed by
a child is the sum of total and/or partial orders of the relevant markedness and faithfulness constraints.
Furthermore, we claim that the developmental path a child chooses to follow is simply a subset of these partial
orders. Interestingly, this approach allows us to draw a distinction between smart and non-smart developmental
paths. The former lead smoothly to the target grammar because they consist of grammars that include the target
grammar. The latter, on the other hand, are paths that do not contain a grammar which exhibits typological links
with the target one, and, due to lack of positive reinforcement, they are inevitably driven to extinction.
The remainder of this paper is organized as follows: In section 2, we describe the research methodology
followed in data collection and encoding. In section 3, we introduce Anttila’s (1997a, b, 2002 et seq.) theory of
variation. We discuss the implementation of this model in acquisition in section 4 and continue in section 5 with
exploring its ramifications for phonological development. In particular, we draw a distinction between smart and
non-smart developmental paths, as these are revealed by inter- and intra-child variation data from Greek. In
section 6, we conclude this paper.
The present survey is based on Tzakosta’s (2004) Greek L1 acquisition database. In this paper, however, we
confine our discussion and results to a group of three children, two girls and one boy – Ioanna (Io), Marilia (Ma)
and Bebis Metaxaki (BM). The children recruited for this study were raised in monolingual environments and
range in age from 1;9.22 to 3;05.23 years. The observations and generalizations drawn in this paper have
broader empirical coverage; however, space limitations prevent us from including datasets from other children
in our discussion.
Our survey relies both on controlled and spontaneous natural speech data. The former were elicited with
the help of a semi-structured technique of picture naming, whereas the latter were collected through free
interaction of the interviewer with the child. The data were recorded by a trained linguist who visited the
children on a weekly basis in 25-45 minute-long sessions over a period of two years, depending on each child’s
recording period. An analogue recorder and a microphone were used for the recordings. The recorded speech
samples were transcribed into IPA by two trained phoneticians and one of the authors, all native speakers of
Greek. In the few cases that the transcribers could not reach full agreement on what was produced by the child,
the data were omitted. In general, the transcription process was consistent with the criteria of reliability posited
in Bennett-Kastor (1988). The data were organized in an Access Database (Leiden University/ULCL).
3. Anttila’s Partial Ordering Theory of variation
As mentioned in the introductory part of this paper, young learners of Greek display variation in their
productions. In this section, we outline the basics of an OT-based theory of variation that will serve as the
theoretical basis for the definition of developmental paths.
Anttila (1997a, b, 2002 et seq.) exploits an intrinsic property of OT, namely partial ordering (PO), to
account for variation in Finnish phonology and morphology. According to this theory, variation is the result of
partial orders. More specifically, in a total order, every constraint is fully ranked with respect to every other
constraint; in a partial order, on the other hand, the ranking remains incomplete. The abstract example in (1),
adopted from Anttila (1997b: 24-26), helps us understand the distinction between total and partial orders.
Given the constraints A, B and C in (1a), the grammar A » B » C is a total order because every constraint
occupies a position in a separate stratum. This grammar yields a single output for a given input form. Thus, it
predicts no variation. By removing one of the rankings, for example B » C, we obtain the partial order A » B, C
which corresponds to the rankings given in (1b). Each of the unordered constraints is ranked below constraint A
but, crucially, they are not ranked with respect to one another. The grammar in (1c) is thus a partially ordered grammar.
(1) a partially ordered grammar
    a. Constraints: A, B, C
    b. Rankings: A » B, A » C
    c. Grammar: A » B, C
Crucially, in (1c), the absence of ranking between B and C entails that, given one input form, the grammar
generates two possible outputs and, consequently, predicts language variation. More specifically, it corresponds
to the two totally ordered tableaux given in (2). Tableaux 1 and 2 illustrate that the crucial ranking between
constraints B and C yields different winning forms. More specifically, ranking B » C favors cand2, whereas
ranking C » B appoints cand1 as the winner.
(2) totally ordered tableaux
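The correspondence between the partial order in (1) and its two total orders can be checked mechanically. The sketch below is only an illustration; the helper name `total_orders` and the encoding of rankings as (higher, lower) pairs are assumptions.

```python
from itertools import permutations

def total_orders(constraints, rankings):
    """All total orders of `constraints` that respect every (higher, lower) pair."""
    result = []
    for order in permutations(constraints):
        position = {c: i for i, c in enumerate(order)}
        if all(position[hi] < position[lo] for hi, lo in rankings):
            result.append(order)
    return result

# Rankings from (1b): A » B and A » C; B and C are left unranked.
for order in total_orders("ABC", [("A", "B"), ("A", "C")]):
    print(" » ".join(order))
# A » B » C
# A » C » B
```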
Let us examine how this model of variation accounts for variable pairs such as ˈθelun ~ ˈθelune ‘want-PRES.3PL’
in Greek. The former output ends in a closed syllable, suggesting that MAX-SEG and DEP-SEG are both
ranked above the NOCODA constraint. The latter output, however, implies that a different ranking is also at play,
one in which DEP-SEG is outranked by NOCODA. In the PO model, the observed pattern of variation is a
reflection of the partially ordered grammar in (3a), which corresponds to the two total orders in (3b-c):
(3) a. MAX-SEG » DEP-SEG, NOCODA
    b. MAX-SEG » DEP-SEG » NOCODA
    c. MAX-SEG » NOCODA » DEP-SEG
This partial order corresponds to two tableaux which appoint different candidates as winners: ˈθelun is the
winner in T3 and ˈθelune is the winner in T4. Since the described partial grammar permits two rankings, it also
permits two outputs and hence variation.
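The way the two total orders select different winners can be illustrated with a small evaluator. The candidate set and the violation marks below are assumptions supplied for the example (one coda for the faithful form, one inserted segment, one deleted segment); they are not taken from the original tableaux, and stress is left unmarked.

```python
# Assumed violation profiles for the input /θelun/ (illustrative only).
CANDIDATES = {
    "θelun":  {"NOCODA": 1},   # faithful, ends in a closed syllable
    "θelune": {"DEP-SEG": 1},  # final vowel inserted, no coda
    "θelu":   {"MAX-SEG": 1},  # final consonant deleted, no coda
}

def winner(ranking, candidates=CANDIDATES):
    """Return the candidate with the lexicographically smallest violation vector."""
    def profile(name):
        return tuple(candidates[name].get(c, 0) for c in ranking)
    return min(candidates, key=profile)

print(winner(["MAX-SEG", "DEP-SEG", "NOCODA"]))  # θelun
print(winner(["MAX-SEG", "NOCODA", "DEP-SEG"]))  # θelune
```

Under these assumed marks the deletion candidate never wins, because MAX-SEG dominates the other two constraints in both total orders.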
Consequently, if a grammar is defined as a total order, in the case of variation we deal with multiple
grammars (Anttila 1997b:29).4 A partial grammar, in other words, consists of a set of total orders, e.g. MAX-SEG
» DEP-SEG » NOCODA and MAX-SEG » NOCODA » DEP-SEG, each of which constitutes what is called here a
subgrammar (cf. Waterson 1971, for relevant discussion on subgrammars in language acquisition). Addition of
more rankings in a partially ordered grammar means that we generate proper subsets of this grammar. Anttila
(2002:21) explains that “the resulting partial orders will each be increasingly specific and contain fewer and
fewer total orders. The most specific partial order is one where every constraint is ranked with respect to every
other constraint, which equals a single total order.” Thus, by adding rankings between the constraints, we
proceed to more specific grammars and hence to a complete disappearance of variation. In the next section, we
move on to present an application of partial ordering to the MPG model of L1 acquisition.
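The effect of adding rankings can be made concrete by counting the total orders that remain after each addition. The sketch uses the abstract constraints A, B and C; the function name `count_total_orders` is an assumption.

```python
from itertools import permutations

def count_total_orders(constraints, rankings):
    """Number of total orders that respect every (higher, lower) pair."""
    return sum(
        all(order.index(hi) < order.index(lo) for hi, lo in rankings)
        for order in permutations(constraints)
    )

rankings = []
counts = [count_total_orders("ABC", rankings)]
for pair in [("A", "B"), ("A", "C"), ("B", "C")]:
    rankings.append(pair)
    counts.append(count_total_orders("ABC", rankings))

print(counts)  # [6, 3, 2, 1]
```

With no rankings all six permutations are available; each added ranking narrows the set until a single total order, and hence no variation, is left.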
4. Variation in L1 acquisition
4.1. Variable stress outputs
The diversity in the manifestation of stress in Greek will serve as a representative example of variation from L1
acquisition. It is well-established that stress can appear on any of the last three syllables of the word (Malikouti-
Drachman and Drachman 1989, Drachman and Malikouti-Drachman 1999, Revithiadou 1999). Our focus here
is on the so-called metrically ambiguous words, that is, trisyllabic or longer words which carry stress on a non-
peripheral syllable. As shown in (4), for an input of the stress pattern W1SW2,5 Marilia produces three different
outputs: (a) W1S (4a-b), (b) SW2 (4c-d) and (c) the faithful W1SW2 (4e-f).
Such observed patterns of variation led Tzakosta (2004) to the conclusion that children employ more than
one grammar in parallel. These grammars are developed during the grammar explosion stage as the result of
constraint permutation – via demotion – of the initial M » F ranking. To explain, in the transition from an initial
state to a more advanced one, different markedness constraints are demoted in parallel, thus generating a large
pool of possible grammars. To give an abstract example, permutation of just eleven constraints generates
39,916,800 grammars! This is an amazingly large number of grammars for a child to develop. More
importantly, it is impossible for all of them to be empirically motivated. However, Tzakosta (2004) has shown
that variation is not unconstrained, since in reality children make use of a remarkably confined set of grammars.
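The figure for eleven constraints is simply the number of permutations of eleven items, i.e. 11 factorial, as the following check shows (the variable names are illustrative).

```python
import math

n_constraints = 11
n_grammars = math.factorial(n_constraints)  # number of total orders
print(f"{n_grammars:,}")  # 39,916,800
```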
The interested reader is referred to Tzakosta (2004) for the principles and constraints that curb
grammar explosion. In this paper, our focus is on exploring how exactly the child proceeds from a confined set
of grammars to the one that corresponds to the target language. We illustrate this with inter- and intra-child data
from the acquisition of stress.
4.2. Inter-child variation
The young learners of our study group make use of several grammars for stress. Here, we provide robust
versions of the grammars activated by these children, although in reality they may employ more refined versions
of these grammars. The following five grammars are observed: the FAITHFULNESS Grammar, the TRUNCATION
Grammar, the REVERSE RHYTHM Grammar, the WEAK SYLLABLE Grammar and the CONSTRUCTED SYLLABLE Grammar.
The FAITHFULNESS6 Grammar (FAITH-G) corresponds to the target grammar, which yields outputs faithful to the
adult language. As illustrated by the examples in (5), the metrical and prosodic shape of the adult forms
is faithfully reproduced by the children.
The TRUNCATION Grammar (TRUNC-G), on the other hand, is responsible for the manifestation of
truncated, i.e. syncopated, outputs in child speech. The data in (6) demonstrate that, regardless of the stress
pattern of the input word, child productions are maximally two syllables long. Interestingly, the output template
is either trochaic or iambic depending on whether the stressed syllable groups with the preceding or with the
following syllable of the input word.
Furthermore, the REVERSE RHYTHM Grammar (REVRHYTHM-G) strives towards unfaithful realizations of
the metrical pattern of input forms. For example, disyllabic words of trochaic shape are realized as iambs (7a-c)
and vice versa (7d-e). It is worth pointing out that in the speech of many children in Tzakosta’s (2004) database
words of the pattern W1SW2 are produced with reverse rhythm, namely SW1W2, e.g. /vu.ˈva.li/ → [ˈbu.ba.li]
‘buffalo-NOM.SG’ (Melitini 1;7.14). This grammar is more likely to arise in the acquisition of Greek stress where
trochaic forms co-exist with iambic as well as metrically ambiguous ones than in more transparently trochaic
systems such as English or Dutch.
The WEAK SYLLABLE Grammar (WEAKSYLL-G) and the CONSTRUCTED SYLLABLE Grammar
(CONSTRSYLL-G) are more marginal than the others. The former strives towards preserving weak (unstressed)
syllables, preferably those that lie at one of the word edges. Some representative examples are given in (8). As is
obvious from examples such as (8d), for instance, markedness restrictions often apply to simplify the segmental
composition of the produced form. Stress is predominantly trochaic in these productions.
The CONSTRSYLL-G averts the emergence of complex structures and yields mainly monosyllabic outputs
that consist of the most unmarked (i.e. underspecified) segments of the input form. To explain, such syllables
are composed of the least marked segments in the word, but not necessarily the ones that occupy the head
position. Some illustrative examples are provided in (9), where coronals, for example, are favored over velars and
stops are chosen over fricatives. In (9a), for instance, the syllable with the coronal voiceless stop /to…/ surfaces in
preference to the stressed one or, even, the other syllables in the word that contain a fricative or a sonorant. The
stressed syllable is chosen to be pronounced only when it has a relatively unmarked segmental make-up
compared to the remaining syllables of the word, as shown by examples such as (9b).
Table 1 presents all attested grammars in the speech of the three children under examination and their
respective outputs for SW, WS, SW1W2, W1SW2 and W1W2S7 input forms.
Table 1. Set of multiple grammars
SW WS SW1W2
TRUNC-G S S SW
WS, WWS, WSW
W W W1,W2, W1W2
The grammars listed in Table 1 can be either total or partial orders. Table 2 lays out the rankings for each
grammar. However, before taking a closer look at these rankings, it is necessary to introduce the relevant
constraints first. As mentioned above, in this paper we present robust grammars, therefore archetypical versions
of the relevant constraints are employed. The constraint Fσ! strives for faithfulness of the stressed syllable,
whereas Fsize is a faithfulness constraint that requires preservation of the original size of the target word. On the
contrary, T stands for constraints that favor truncation and, in general, produce reduced versions of the input
word. T is satisfied when sizable chunks of the input word are trimmed off. W is an abbreviated form of a
markedness constraint that compels preservation of some weak syllable. M yields the preservation of unmarked
segments in the produced forms. ‘Unmarkedness’ in this case primarily involves the selection of consonants
lacking positive featural values over other consonants. M is active not only in the CONSTRSYLL-G, but also
in all grammars that favor the production of unmarked forms. Crucially, however, in the CONSTRSYLL-G, it is
highly ranked. Finally, the constraint ¬R is a rhythmic constraint that reverses the rhythm-type of an input word
in the output. It is satisfied when output rhythm does not match input rhythm, that is, when an input /SW/ is
realized as [WS] in the output and vice versa.10
It is evident from Table 2 that only the FAITH-G, which yields outputs identical to the target language, is a
total order.11 All other grammars are partial orders and hence yield variable outputs, the number of which is
relative to the number of rankings permitted in each partially-ordered grammar. For example, the TRUNC-G is a
partial order which consists of six total orders, i.e. it corresponds to six tableaux. Each tableau derives a unique
winner. It is possible that the winners of these tableaux converge. Thus, the system predicts at least two and at
most six possible outputs for the TRUNC-G.
Table 2. Totally and partially ordered grammars
FAITH-G:        Fσ! Fsize » T W
TRUNC-G:        T, Fσ!, W » Fsize
WEAKSYLL-G:     T, W » Fσ!, Fsize
REVRHYTHM-G:    ¬R, Fsize » Fσ!, T
CONSTRSYLL-G:   T, M » Fσ!, Fsize
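For a grammar written as a sequence of strata, the number of total orders it subsumes is the product of the factorials of the stratum sizes. The sketch below applies this to the two grammars whose rankings are repeated in Tables 4 and 5; the dictionary layout and the function name `n_total_orders` are assumptions made for illustration.

```python
from math import factorial, prod

# Strata as given for TRUNC-G and CONSTRSYLL-G (highest stratum first).
GRAMMARS = {
    "TRUNC-G":      [["T", "Fσ!", "W"], ["Fsize"]],
    "CONSTRSYLL-G": [["T", "M"], ["Fσ!", "Fsize"]],
}

def n_total_orders(strata):
    """Constraints inside a stratum can be permuted freely; strata are fixed."""
    return prod(factorial(len(stratum)) for stratum in strata)

for name, strata in GRAMMARS.items():
    print(name, n_total_orders(strata))
# TRUNC-G 6
# CONSTRSYLL-G 4
```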
Interestingly, we found out that in our case studies there is a subset relation among the sets of grammars of
the children examined. As illustrated in Table 3, BM’s grammar inventory is a proper subset of Ioanna’s
grammar inventory. More specifically, BM lacks the CONSTRSYLL-G that Ioanna has.
Table 3. Cross-child distribution of grammars
This observation extends beyond the specific pair of learners and, in general, characterizes the distribution
of grammars across all children in Tzakosta’s database: some grammars appear steadily in all children’s speech
whereas others do not. For instance, FAITH-G and TRUNC-G are among the most widespread ones. In contrast,
the REVRHYTHM-G and the CONSTRSYLL-G are less common (Tzakosta 2004). In fact, they are attested
primarily in Ioanna’s speech. This is anticipated given that Ioanna was recorded for the longest period of time.
Therefore, it is more likely that her speech displays a wider range of grammars including also those that are
typologically more distant from the target grammar. The question that naturally arises at this point is why some
grammars appear to have a broader empirical coverage than others. We know from previous studies that
typologically simple and less marked grammars are favored by children (Gnanadesikan 2004). However, under
this assumption, it comes as a surprise that the FAITH-G is also among the preferred grammars of the young
learners of Greek. In order to answer this question, we must first take a careful look at the grammars at hand
and, then, explore the nature of the rankings they consist of.
Table 4 demonstrates that TRUNC-G is a partial order which consists of six total orders, two of which
contain some aspect of the target grammar, namely Fσ! » T » W » Fsize and Fσ! » W » T » Fsize. In particular, in
subgrammar-3 and subgrammar-4, Fσ! is ranked above T and W, respectively. Crucially, this ranking partly
coincides with the target grammar. In other words, TRUNC-G subsumes some version of the target grammar, in
the sense that Fσ!, which requires the preservation of the stressed syllable, is ranked in the highest stratum of
the constraint hierarchy.
Table 4. TRUNC-GRAMMAR: rankings and subgrammars
Partial order:    T, Fσ!, W » Fsize
Target Grammar:   Fσ! Fsize » T W
Subgrammar-1:     T » Fσ! » W » Fsize
Subgrammar-2:     T » W » Fσ! » Fsize
Subgrammar-3:     Fσ! » T » W » Fsize
Subgrammar-4:     Fσ! » W » T » Fsize
Subgrammar-5:     W » Fσ! » T » Fsize
Subgrammar-6:     W » T » Fσ! » Fsize
Unlike the TRUNC-G, the CONSTRSYLL-G is a partial order which consists of four total orders which
correspond to the four tableaux shown in Table 5. Strikingly, none of these orders identifies with the target
grammar. This is because in all subgrammars both faithfulness constraints occupy a low rank in the constraint hierarchy.
Table 5. CONSTRSYLL-GRAMMAR: rankings and subgrammars
Partial order:    T, M » Fσ!, Fsize
Target Grammar:   Fσ! Fsize » T M
Subgrammar-1:     T » M » Fσ! » Fsize
Subgrammar-2:     T » M » Fsize » Fσ!
Subgrammar-3:     M » T » Fσ! » Fsize
Subgrammar-4:     M » T » Fsize » Fσ!
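The contrast between Tables 4 and 5 can be reproduced by expanding each partial order into its subgrammars and checking which of them place the stressed-syllable faithfulness constraint at the top. Using top-ranked Fσ! as the test for sharing "some aspect of the target grammar" is a simplifying assumption based on the discussion of TRUNC-G; the function names are illustrative.

```python
from itertools import permutations, product

def subgrammars(strata):
    """Expand a stratified partial order into all of its total orders."""
    return [sum(combo, ())
            for combo in product(*(permutations(s) for s in strata))]

def stress_faith_on_top(order, faith="Fσ!"):
    """Assumed proxy for sharing an aspect of the target grammar."""
    return order[0] == faith

TRUNC_G = [["T", "Fσ!", "W"], ["Fsize"]]
CONSTRSYLL_G = [["T", "M"], ["Fσ!", "Fsize"]]

trunc_hits = [o for o in subgrammars(TRUNC_G) if stress_faith_on_top(o)]
constr_hits = [o for o in subgrammars(CONSTRSYLL_G) if stress_faith_on_top(o)]

print([" » ".join(o) for o in trunc_hits])
# ['Fσ! » T » W » Fsize', 'Fσ! » W » T » Fsize']
print(constr_hits)  # []
```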
Let us take a closer look at two representative total orders of the CONSTRSYLL-G. Tableau 5 exemplifies
subgrammar-1 and tableau 6 exemplifies subgrammar-3. In both subgrammars, T or M occupies the highest
stratum whereas Fσ! and Fsize occupy the lowest one. As a result, only truncated forms that crucially do not
contain the stressed syllable qualify as optimal outputs. Under no ranking does the stressed syllable have a chance to
emerge, unless it so happens that it contains the most unmarked segment in the word, e.g. /ˈvle.po/ → [ˈlep].
[Tableau 5: subgrammar-1, T » M » Fσ! » Fsize]
[Tableau 6: subgrammar-3, M » T » Fσ! » Fsize]
Turning now to the question concerning the somewhat unexpected predominance of FAITH-G and, to some
extent, the TRUNC-G, as opposed to the marginality of the WEAKSYLL-G, the REVRHYTHM-G and the
CONSTRSYLL-G, we argue that this can receive a straightforward explanation in a model that views parallel
grammars as partial orders. A comparison of these parallel grammars reveals that partially ordered
grammars that subsume total orders containing the target grammar are more likely not only to emerge early in
acquisition but also to die out more slowly than other grammars. We argue that this is because they
share strong typological links with the target grammar. The results of inter-child variation are further supported
by our findings regarding intra-child variation, on which the next subsection focuses.
4.3. Intra-child variation
Table 6 shows the distribution of grammars in the language development of a single child. Ioanna will serve as a
case study. Ioanna’s grammar explosion state is divided into two periods on the basis of the grammars used.
Interestingly, she uses only a subset of the original grammars in the second period of her language development.
More specifically, in the transition from the first to the second period she abandons the CONSTRSYLL-G and the
REVRHYTHM-G. It should be noted that the abandoned grammars do not subsume a single total ranking that
corresponds to the adult/target language and, consequently, are of the type that cannot guarantee a smooth
transition to the target grammar.
Table 6. Intra-child distribution of grammars
Ioanna (period 1)
Ioanna (period 2)
5. The Grammar Inclusion Hypothesis
The examination of inter- and intra-child variation data revealed that partial orders which include the target
grammar are attested in all children’s speech, whereas partial orders that do not contain even a single total order
that relates to the target grammar are highly disfavored. The Grammar Inclusion Hypothesis stated in (10)
provides a viable explanation for this disparity. A partial grammar of the former type constitutes a smart
developmental path because it includes the target-language ranking and, eventually, is designed to lead to it. A
partial grammar of the latter type, however, does not contain the target-language grammar and hence constitutes
a non-smart developmental path. Undeniably, such a path delays the acquisition process because it is not
typologically related to the target grammar. A non-smart path fails to connect the running state of acquisition
with the end state of language development and hence it slowly but surely loses ground.
(10) Grammar Inclusion Hypothesis
Advanced grammars are partial orders which subsume total orders that are proper subsets of
early grammars. A smart developmental path is one that subsumes subset total orders that
contain the target (more faithful) grammar. (Kateri, Revithiadou and Varlokosta 2005: 33)
The following abstract example helps us visualize the described system of relations. Any partially ordered
grammar (Grammar-1) forms a possible developmental path since it subsumes more than one total
order/subgrammar (Subgrammar-2 and Subgrammar-3). A developmental path is smart, if and only if it contains
at least one total order that reflects the target grammar; otherwise the described web of (sub)grammars forms a
non-smart path. The path depicted in (11) qualifies as smart because the partial order A » B, C contains
Subgrammar-2 which corresponds to the target grammar.
(11) developmental path
     Grammar-1:     A » B, C   (partial order)
     Subgrammar-2:  A » B » C  (total order)
     Subgrammar-3:  A » C » B  (total order)
     where A » B » C is the target grammar
     ∴ G-1 ⊃ G-2, G-3 is a developmental path; it is smart iff SubG-2 or SubG-3 ≅ target grammar
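The definition of a smart path can be stated as a simple membership test. In the sketch below the relation ≅ is simplified to identity with the target total order, which is an assumption; the second partial order (C » A, B) is a hypothetical non-smart path added for contrast, and the function names are illustrative.

```python
from itertools import permutations

def total_orders(constraints, rankings):
    """Set of total orders that respect every (higher, lower) pair."""
    return {
        order for order in permutations(constraints)
        if all(order.index(hi) < order.index(lo) for hi, lo in rankings)
    }

def is_smart(constraints, rankings, target):
    """A path is smart iff the target total order is among its subgrammars."""
    return tuple(target) in total_orders(constraints, rankings)

target = ("A", "B", "C")
print(is_smart("ABC", [("A", "B"), ("A", "C")], target))  # True:  A » B, C
print(is_smart("ABC", [("C", "A"), ("C", "B")], target))  # False: C » A, B
```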
Given that smart paths share some typological properties with the target grammar, it becomes clear why
they are preferred by all children: they smoothly lead to the target grammar without the chance of obstructing
phonological acquisition. Thus, smart paths lower the expectation for regression to earlier developmental stages.
The question, however, that arises from the discussion so far is why ‘non-smart’ paths are adopted by children
to begin with. The answer to this question lies in the nature of the subgrammars that comprise ‘non-smart’
paths. Such subgrammars are governed by UG principles such as sonority, ideal prosodic size (e.g. word
binarity), and so on. In general, they reflect the unmarked properties of human language. Hence, they are
expected to emerge during the initial and, naturally, the intermediate state of L1 acquisition as well. Moreover,
non-smart paths may constitute real grammars or possible subgrammars of other languages of the world.
Consequently, given the unifying character of UG, the grammars of some languages may be potential
developmental grammars in other languages. Input frequency in the target language relates the actual data with
these potential grammars and renders the latter smart or non-smart developmental paths. Extending this line of
thought, we also predict that grammars not typologically related to the target grammar, as well as grammars not
governed by UG principles, form impossible paths and hence are never used by children.
One may wonder how exactly non-smart paths get out of the picture. Do they succumb to the leveling
forces of the other grammars? The most plausible scenario is that non-smart paths die out due to their poor
typological relevance to the target grammar. Nor do input frequency effects reinforce their
maintenance. In other words, non-smart paths are disposed of as soon as children realize, on the one hand,
their minuscule typological identity or even profound disparity with the target grammar and, on the other hand,
the absence of empirical evidence in their support. Our prediction is that the more non-smart paths are activated,
the slower the process of acquisition is.
6. Summary and conclusions
In order to account for the patterns of variation observed in the speech of children acquiring Greek L1, we
proposed a grammatical model that relies on the MPG theory of acquisition (Revithiadou and Tzakosta 2004a,b,
Tzakosta 2004) and, at the same time, incorporates the basic insights of Anttila’s (1997a, b, 2002) theory of
phonological and morphological variation and, specifically, partial ordering.
The application of partial ordering to language acquisition has several advantages. First, it allows us to
account for variation in child speech in terms of a grammar with incomplete constraint rankings. Subgrammars
are taken to be not accidental constellations of constraint hierarchies but rather the sum of total orders a partially
ordered grammar consists of. To put it simply, the fewer the missing rankings in a grammar, the fewer the total
orders, and hence the subgrammars, it subsumes. Second, partial ordering offers a principled basis, i.e. the
Grammar Inclusion Hypothesis, for the definition of developmental paths. More specifically, it succeeds in
drawing a distinction between smart paths that gradually lead to the target grammar and non-smart paths that
possibly trigger regression and, inevitably, impede the acquisition process.
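The relation between missing rankings and the number of subgrammars can be illustrated with a small count. This is only a sketch over three abstract constraints A, B and C; the helper `count_total_orders` is a hypothetical name introduced for illustration, and the sequence of added rankings is an assumed example rather than attested developmental data.

```python
from itertools import permutations


def count_total_orders(constraints, rankings):
    """Count the total orders consistent with a set of (higher, lower) pairs."""
    count = 0
    for perm in permutations(constraints):
        position = {c: i for i, c in enumerate(perm)}
        if all(position[hi] < position[lo] for hi, lo in rankings):
            count += 1
    return count


constraints = ["A", "B", "C"]
steps = [
    set(),                                  # no rankings fixed yet
    {("A", "B")},                           # A » B
    {("A", "B"), ("A", "C")},               # A » B, C
    {("A", "B"), ("A", "C"), ("B", "C")},   # A » B » C (total order)
]

counts = [count_total_orders(constraints, r) for r in steps]
print(counts)
# [6, 3, 2, 1]
```

Each added ranking narrows the set of subsumed total orders, until a fully ranked grammar corresponds to exactly one subgrammar.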
In the future, there is a need to further explore and test the predictive and explanatory power of the
Grammar Inclusion Hypothesis. More specifically, the investigation should focus, first, on the theoretical and
typological distinctions between paths that qualify as smart or not from the outset of phonological development,
and, second, on how non-smart paths emerge and eventually decline.
We wish to thank an anonymous reviewer, the Editors of the JGL as well as Michalis Georgiafentis and the
audience of ICLaVe 3 (Meertens Institute, Amsterdam 23-25 June 2005) for their insightful comments and
valuable suggestions. All errors are of course our own.
1 For different versions of constraint demotion algorithms, the interested reader is referred to Tesar and Smolensky (2000).
2 According to Tesar and Smolensky (2000), the learner does not get information about the target-language ranking based on
positive evidence only. Every piece of data brings with it a body of implicit negative evidence. More specifically, having
been exposed to a positive datum p, the learner knows that, with respect to the unknown constraint hierarchy of the
language being learned, the alternative parse of the same input p’ is less harmonic (Tesar and Smolensky 2000:30-43).
Therefore, each piece of positive initial data conveys a large amount of inferred information.
3 For a different definition of developmental paths, the interested reader is referred to Levelt and Van de Vijver (2004).
4 See Kiparsky (1993) for a view of variation in terms of competing grammars.
5 W stands for weak syllables and S for strong (stressed) syllables.
6 Faithfulness here refers to prosodic form and not to segmental content and structure.
7 The notation S, W, W1, W2 refers to syllable types which, production-wise, remain intact, whereas CV stands for the form
which is produced with unmarked segments.
8 The total rankings for each grammar are provided in Appendix A. The statistics for the FAITH-G for each child are given in
9 The outputs of the CONSTRSYLL-G may vary in size depending on the exact rank of the constraints that promote truncation
in the system. In their vast majority, these outputs are monosyllabic and disyllabic; longer ones cannot be excluded but they
are practically unattested in our speech sample. This is expected given the high rank of T constraints in the grammar.
10 In a language like Greek, where both SW and WS stress patterns emerge on the surface, this constraint is technically an
encapsulation of different rankings of the foot-type constraints IAMB and TROCHEE.
11 Although there is no ranking relation between Fσ! and Fsize, or between T and W, we take this grammar to be a total order
because the constraints in each respective pair belong to the same stratum and therefore cannot contrast with each other. This
is indicated by the absence of commas between the constraints.
Anttila, Arto. 1997a. “Deriving variation from grammar”. Variation, Change and Phonological Theory ed. by
Frans Hinskens, Roeland Van Hout and W. Leo Wetzels, 35–68. Amsterdam/Philadelphia: John Benjamins.
Anttila, Arto. 1997b. Variation in Finnish Phonology and Morphology. Ph.D. dissertation, Stanford University.
Anttila, Arto. 2002. “Morphologically conditioned phonological alternations”. Natural Language and Linguistic
Bennett–Kastor, Tina. 1988. Analyzing Children’s Language: Methods and Theories. Oxford and New York:
Chomsky, Noam. 1981. “Principles and parameters in syntactic theory”. Explanation in Linguistics ed. by
Norbert Hornstein and David Lightfoot, 32–75. London: Longman.
Demuth, Katherine and E. Jane Fee. 1995. “Minimal words in early phonological development”. Ms., Brown
University, Providence, Rhode Island, and Dalhousie University, Halifax, Nova Scotia.
Drachman, Gaberell. 1973a. “Generative Phonology and Child Language Acquisition”. The Ohio State
University Working Papers in Linguistics no. 15: Generative Phonology and Child Language Acquisition
ed. by Angeliki Malikouti–Drachman, Gaberell Drachman, Mary Louise Edwards, Jonnie E. Geis and
Lawrence C. Schourup, 146–160. Columbus, Ohio: The Ohio State University.