Converting the Penn Treebank to Systemic Functional Grammar
Department of Linguistics, Macquarie University
Abstract

Systemic functional linguistics offers a grammar that is semantically organised, so that salient grammatical choices are made explicit. This paper describes the explication of these choices through the conversion of the Penn Treebank into a systemic functional grammar corpus. Developing such a resource can help connect work in natural language processing to a significant body of research dealing explicitly with the issue of how lexical and grammatical selections create meaning.

Introduction

The Penn Treebank was designed to maximise consistency and annotator efficiency, rather than conformity with any particular linguistic theory (Marcus et al., 1994). This results in trees that strongly suggest the use of synthetic features to explicate semantically significant grammatical choices like mood, tense, voice or negation. These distinctions lie latent in the configuration of the tree in the Treebank II annotation scheme, making it difficult for a machine learner to make use of them.

Rather than the ad hoc addition of this information at the feature extraction stage, the corpus can be re-presented in a way that makes feature extraction more principled. This involves increasing the size and complexity of the representation of a sentence by organising the tree semantically. Organising a grammar semantically is by no means a trivial task, and has been an active area of linguistic research for the last forty years. This paper describes the conversion of the Penn Treebank into a prominent output of such research, systemic functional grammar (SFG).

Systemic functional grammar does not confine its description to syntactic structure, but includes a representation of the choices that grammatical configurations represent, or ‘realise’, to use the term preferred in the linguistics literature (Halliday, 1976).

There is growing evidence that systemic functional grammar can be usefully applied to natural language processing (Munro, 2003; Couchman and Whitelaw, 2003), and there is a strong history of interaction between systemic functional linguistics and natural language generation (Matthiessen and Bateman, 1991). However, there is currently a lack of computational SFG resources. There is no standard format for machine readable annotation, no annotated corpora, and no useable parsers. Converting the Penn Treebank will make a large body of SFG annotated data available to computational linguists for the first time, an important step towards addressing this situation.

We first discuss some preliminaries relating to the nature of systemic functional grammar, and the scope of the converted corpus’s annotation. We then discuss the conversion of the treebank’s phrase-structure representation to SFG constituency structure, and finally we discuss the addition of interpersonal and textual function structures.

Structure of the SFG analysis

Systemic functional grammar divides the task of grammatical analysis (the process of stating the grammatical properties of a text) into two parts: analysis of syntactic structures, and analysis of function structures.

SFG syntactic analysis is constituency based, and is predicated on Halliday’s notion of the rank scale (Halliday, 1966): clauses are composed of groups/phrases, which are composed of words, which are composed of morphemes. The concerns of SFG syntactic analysis are the chunking of words into groups/phrases, and the chunking of groups/phrases into clauses. Levels of constituency between groups/phrases and their words are recognised in the literature (Matthiessen, 1995), but rarely brought into focus in research unless the group/phrase contains, or is, an embedded constituent from another rank (e.g., a nominal group like ‘the man’ with an embedded relative clause like ‘who knew too much’).
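The rank scale lends itself to a simple nested data structure. The following is a minimal Python sketch, not the representation used in the conversion script: the class names (Word, Group, Clause), the group labels and the morpheme segmentation of the example are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Word:
    text: str
    # Hypothetical morpheme segmentation, for illustration only.
    morphemes: List[str] = field(default_factory=list)


@dataclass
class Group:
    label: str  # e.g. "nominal" or "verbal" (labels are assumptions)
    words: List[Word] = field(default_factory=list)


@dataclass
class Clause:
    groups: List[Group] = field(default_factory=list)

    def word_texts(self) -> List[str]:
        """Flatten the clause to its words, in order."""
        return [w.text for g in self.groups for w in g.words]


# "the dog bit the boy": a clause of three groups, each made of words.
clause = Clause([
    Group("nominal", [Word("the", ["the"]), Word("dog", ["dog"])]),
    Group("verbal", [Word("bit", ["bite", "PAST"])]),
    Group("nominal", [Word("the", ["the"]), Word("boy", ["boy"])]),
])
print(clause.word_texts())
```

Each rank contains only units of the rank below it, which is the ‘minimal bracketing’ that the constituency conversion aims for.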
Function structures can refer to any rank of the constituency, but clause rank functional analysis is generally regarded as the most important. The grammar defines a set of systems, which can be defined recursively using conjunction and disjunction. They are usually represented graphically in system networks (Matthiessen, 1995), as in Figure 1.

In this figure, the nested disjunction ‘declarative or interrogative’ represents a more delicate, or finer grained, distinction than that between indicative and imperative. After selecting from the initial choice, one proceeds from left to right into increasingly delicate distinctions. These systems are categorised into three metafunctions, which represent different types of meaning language enacts simultaneously (ideational, interpersonal and textual) (Halliday, 1969).

Figure 1: A simple mood system, ‘(declarative or interrogative) or imperative’

Scope of target annotation

There is no clearly defined limit to systemic functional grammar, no point at which one could say that a text has been ‘fully’ analysed. The grammar is constantly being extended, with new kinds of analysis and levels of delicacy suggested. The ultimate aim of the approach is to distinguish every semantically distinct wording choice (Hasan, 1987).

When working with systemic functional grammar, then, practitioners generally define the scope of their analysis. We must do the same, although the reasons are different. Analysis, so far, has always been performed manually, with only finite time available. Projects have therefore had to decide between the size of a sample and the detail of its analysis. In our case, we are limited to the kinds of analysis which can be directly inferred from the Penn Treebank. Future research will doubtless leverage other resources to extend the analysis of the corpus we present, but attempts to do so are beyond the scope of this paper.

The Penn Treebank presents accurate constituency and part-of-speech information. This is enough information to annotate the corpus automatically with roughly two thirds of the most important clause rank systems: mood and theme, but not transitivity.

The distinction between systems which can be automatically annotated and systems which cannot lies in the way the systems are realised. Mood and theme are realised primarily through the order of constituents (the order of Subject and Finite in the case of mood, and the first Adjunct, Subject, Complement or Predicator in the case of theme). They are realised structurally, as opposed to lexically. Other systems are realised through the selection of grammatical items (also called ‘function words’, a term we prefer not to use because of the special sense of ‘function’ in the context of SFG). Systems that are realised with grammatical items, such as voice, polarity and tense, can also be automatically annotated. Lexically realised systems, on the other hand, require a lexicon or equivalent resource, since the choice of words within identical syntactic structures changes the selection from the system. Trees which are identical at every level except their leaves can have different process type selections. The central system of transitivity, process type, cannot be analysed for this reason.

The annotation of the corpus we present therefore attempts to include selections from the following systems at clause rank:

– mood (i.e. mood type and role tags for Subject, Finite, Predicator, Adjunct, Complement, Vocative)
– clause class
– theme (i.e. role tags for Textual Theme, Interpersonal Theme, Topical Theme, Rheme)

Ideational analysis is omitted entirely, because transitivity analysis requires a more complicated approach, as discussed above. Although arguably some aspects of taxis and expansion type could be annotated automatically, because the central information cannot be annotated, we have left it out entirely.
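The recursive definition of systems by disjunction can be made concrete with a small data structure. The sketch below is an illustrative Python encoding of the mood system of Figure 1, assuming the conventional reading in which ‘indicative’ is the entry condition for the more delicate choice between ‘declarative’ and ‘interrogative’. The names MOOD and selection_paths are hypothetical and not part of the conversion script, and conjunction (simultaneous systems) is not modelled.

```python
# A system maps each term to None (no further choice) or to the more
# delicate system for which that term is the entry condition.
MOOD = {
    "indicative": {
        "declarative": None,
        "interrogative": None,
    },
    "imperative": None,
}


def selection_paths(system, prefix=()):
    """Enumerate every path through the network, least delicate term first."""
    paths = []
    for term, subsystem in system.items():
        path = prefix + (term,)
        if subsystem is None:
            paths.append(path)
        else:
            paths.extend(selection_paths(subsystem, path))
    return paths


for path in selection_paths(MOOD):
    print(" -> ".join(path))
```

Reading a path from left to right corresponds to moving through the network into increasingly delicate distinctions.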
We have not found it necessary to use a method of automatic rule induction to generate a CFG. The lack of a suitable training set made that approach impractical for the time and resources we have had available; and good results have been obtained by simply using a set of hard-coded transformation functions, implemented as a Python script. This approach does have a significant drawback, however: because the script does not output a conversion grammar, correcting systematic errors and other maintenance or extension tasks are much more difficult.

Figure 2: Raising of NP and PP nodes dominated by a VP
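To illustrate what one such hard-coded transformation function might look like, the following Python sketch parses a Treebank-style bracketed string into generic nodes and performs the raising shown in Figure 2. It is an illustration under stated assumptions rather than the script itself: Node, parse and raise_vp_children are hypothetical names, function tags are not separated from node labels, and only NP and PP children of a VP are raised.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    word: Optional[str] = None  # set only on leaves
    children: List["Node"] = field(default_factory=list)


def parse(text: str) -> Node:
    """Parse a bracketed string such as '(S (NP (DT the)))' into Nodes."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read() -> Node:
        nonlocal pos
        assert tokens[pos] == "("
        node = Node(tokens[pos + 1])
        pos += 2
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.children.append(read())
            else:
                node.word = tokens[pos]
                pos += 1
        pos += 1  # consume the closing bracket
        return node

    return read()


def raise_vp_children(clause: Node, raised=("NP", "PP")) -> None:
    """Re-attach NP/PP children of each VP to the clause, after the VP."""
    new_children = []
    for child in clause.children:
        new_children.append(child)
        if child.label == "VP":
            keep = [c for c in child.children if c.label not in raised]
            move = [c for c in child.children if c.label in raised]
            child.children = keep
            new_children.extend(move)
    clause.children = new_children


tree = parse("(S (NP (DT the) (NN dog)) (VP (VBD bit) (NP (DT the) (NN boy))))")
raise_vp_children(tree)
print([c.label for c in tree.children])
```

After the call, the object NP is a sibling of the VP under the clause node instead of a child of the VP. A function written this way encodes its rule directly in conditionals, which is why no conversion grammar falls out of the approach.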
The first process in the conversion of a sentence is to parse the Lisp-style string representation into a tree of generic node objects. Each node contains a function tag (which may be null), a node label and a set of children (which may be empty). The root node is then used to initialise a sentence object, which sorts its immediate children into clause, group and verbal group objects. As each class is initialised, it initialises a clause, verbal group, other group or lexis object with each of its children. The tree is thus recursively re-represented by more specific constituent objects, rather than generic node objects. Subtyping the nodes facilitates the changes to the structure that must be performed, since the structural changes are mostly specific to either verbal groups or clauses.

These changes are divided into a series of steps, each coded as a function. Each function contains a series of conditionals which identify the structure being targeted and how it should be altered. The most significant functions are described in more detail below. This is not an exhaustive list, however, as several trivial changes have been omitted. These include things like node relabelling and the addition of group nodes for conjunctions. There are many changes of this sort, some introduced by the specific mechanics of altering the tree. They are not generally interesting differences between the Treebank’s phrase-structure representation and that of systemic functional grammar.

Raising verb phrase predicates

The most obvious difference between SFG constituency and the Treebank II annotation scheme is the flatter, ‘minimal bracketing’ style SFG uses. To convert a tree to SFG clause constituency, all complements and adjuncts must be raised by attaching them to the clause node; in the Treebank annotation they attach to the verb phrase. Figure 2 illustrates the raising of clause constituents from the verb phrase.

Raising hypotactic clauses

SFG represents the distinction between hypotaxis and parataxis with features, rather than tree structure. All non-nominalised, non-embedded clauses are therefore siblings dominated by the root clause.

Figure 3 shows the Treebank representation, with a hypotactic clause as a child of a VP. Hypotactic clauses are raised to be siblings of the nearest clause node above them. Figure 4 shows the tree after this has been performed.

Figure 3: A clause dominating another

Figure 4: Equally ranked clauses

In the Treebank II annotation scheme, each auxiliary (and the main verb) is given its own node, dominated by the auxiliary before it. This structure needs to be flattened to match the SFG representation. If all of a verb phrase’s lexical items have POS tags in the following list: VB, VBD, VBG, VBN, VBP, VBZ; and it only has one verb phrase child, then its lexis attaches to the verb phrase below it. The empty internal node will later be removed in
the generic ‘flattening’ stage.

Figure 5: Treebank representation of a sentence that contains a verbal group complex

Verbal group complexing

SFG distinguishes between clause complexes and verbal group complexes. The rules for parsing a tree as one or the other type of construction are as follows.

If a verb phrase has one verb phrase child, and dominates a lexis node that is not a finite, then it is treated as a verbal group complex. Additionally, if a verb phrase has a sentence child that is not a direct quotation, does not have the function tag PRN (parenthetical), and is not labelled SBAR (used for relative and subordinate clauses), it is treated as a verbal group complex. For example, SFG renders the tree in Figure 5 as a single clause, with the verbal group “(continued) (to slide)”.

Group and phrase complexing is actually represented a little inaccurately in the script. Ideally, a structural Complex node should be created, and all groups attached to it. This representation would mirror the way clause complexing is handled. Instead, group or phrase complexing is treated like rank-shifting, with the first group dominating the others. This concern is not crucial, however, since it does not affect the clause division or the annotation of function structures.

Ellipsis

Ellipsis was the most difficult case to deal with, since it involves more than just relocating nodes in the tree. A new clause is created when a verb phrase is identified as part of a clause with an ellipsed subject. The verb phrase is moved to the new clause, along with all of its children, and any items identified as ellipsed are copied and attached. Lexis that is copied in this way must be renumbered, so that the clause sorts properly.

When a verb phrase has two or more verb phrase children, each verb phrase child after the first is moved to a new clause. Figure 6 shows the structure of a sentence containing an ellipsed clause. The siblings of the dominant verb phrase (such as the subject), all lexis of the dominant verb phrase (such as the finite), and all children of the ellipsed verb phrase (such as the complement) are copied to the new clauses. In effect, the only items in the ‘original’ clause that are not in the ‘ellipsis’ clauses are children of the first verb phrase (such as the adverbial).

It is not entirely clear that copying the words is the best solution. A trace (an empty group that simply references the original version) is possibly more convenient. The trace solution is more convenient when using the corpus as training data for a computational linguistics task, while copying the elements makes the corpus easier to use for linguistic research. The SFG literature is unhelpful for these kinds of decisions: it is concerned with content descriptions, not representation descriptions.

Pruning and truncating

Lexical nodes that contain only punctuation or traces are pruned from the tree. Group nodes that contain no lexis are also pruned. This operation is performed recursively, from the bottom up, clearing away any branches that have no lexical leaves. Internal nodes that contain only one child are replaced by that child, truncating non-branching arcs of the tree.

The clearance of punctuation is a problem with the script as it currently stands, since clearly this information should not be lost.

Adding Metafunctional Analysis

Function structures must be added after the constituency conversion. The structures attach to clauses in the constituency tree, making separation into clauses essential before systems can be annotated.

Function structures fall into two categories: metafunctional roles, and systems. Metafunctional roles describe the interpersonal, textual or ideational function of a particular constituent, which is considered the role’s realisation. Systems are instead disjunctions from which a term is selected if the entry condition is met. The names of metafunctional roles are generally capitalised in the literature, while system names are given in italics. We follow this convention to help make the distinction clearer.

As with the constituency conversion, function structures were added by hard-coded functions, implemented as a Python script. Four kinds of information are used for metafunctional analysis:

1. The Penn Treebank’s function tags
2. The Penn Treebank’s POS tags
3. The value of other systems
4. The order of constituents in the SFG representation

Figure 6: Treebank representation of an ellipsed clause, with verb phrases named

The use of values from other systems makes the annotation procedure order dependent. They are usually used to determine whether a system’s entry condition has been met. For instance, tense is not selected by non-finite clauses, so the function that discerns tense first checks that requirement, and assigns null tense if the clause has no Finite.

The subsections below give a brief linguistic description of the system being annotated, and then describe the way its selection is calculated. If the entry condition is not met, the selection is considered ‘none’.

Clause class

Class is an interpersonal system with the possible values ‘major’ and ‘minor’. Major clauses are those with a verbal group. Minor clauses are equivalent to sentence fragments in other grammatical theories. An example from the Penn Treebank is the fragment “Not this year.”

If a clause contains a verbal group, it is marked ‘major clause’. If it has no verbal group, it is marked ‘minor clause’.

Finite

Finite is an interpersonal role. The Finite is the tense marker of a verbal group. It is either the first auxiliary, or it is included with the lexical verb as a morphological suffix. The Finite is a significant unit of the grammar, because the placement of it in relation to the Subject realises mood type, and its morphology realises tense selection and number agreement with the Subject.

If a clause is minor class, or the first word of its verbal group has one of the following POS tags: TO, VBG, VBN; then it does not contain a Finite. Otherwise the first word of the verbal group receives the interpersonal role Finite.

Predicator

Predicator is an interpersonal role. The Predicator is the lexical verb of a verbal group.

If a clause is minor class, it does not contain a Predicator. Otherwise, the last word of the verbal group receives the interpersonal role Predicator. If a verbal group has only one word, that word will therefore receive two interpersonal roles (Finite and Predicator). This is the analysis recommended in the literature (Halliday, 1994).

Status

Status is an interpersonal system with the possible values ‘free’ and ‘bound’. Status refers to whether a clause is ‘independent’ or ‘dependent’, to use the terms from traditional grammar.

Minor clauses do not select from the status system, so receive the value ‘none’. Major clauses that have no Finite, or were originally attached to another clause and were tagged SBAR, or are rank-shifted, are considered bound. All other clauses are free.

Subject

Subject is an interpersonal role. The Subject of a verbal group is the nominal group whose number the verbal group must agree with.

Nominal groups realising Subject are generally tagged explicitly in Treebank II annotation. The exception to this is wh- subjects like ‘who’, ‘what’ or ‘which’. If no nominal group has the function tag SBJ, and there is a wh- nominal group that was not attached to the verbal group, that nominal group is considered the Subject.

In clauses with an Initiator (‘I made him paint the fence’), two nominal groups will usually have been marked subject (‘I’, ‘him’). In these cases, the first occurring nominal group is considered the Subject.
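The rules for clause class, Finite, Predicator and status reduce to a few conditionals over the POS tags of the verbal group. A minimal Python sketch follows. It assumes that a clause’s verbal group is given as a list of (word, POS tag) pairs, empty for a minor clause, and that whether the clause was tagged SBAR or is rank-shifted has already been worked out elsewhere; the function and parameter names are hypothetical rather than taken from the conversion script.

```python
NON_FINITE_TAGS = {"TO", "VBG", "VBN"}


def clause_class(verbal_group):
    """'major' if the clause has a verbal group, otherwise 'minor'."""
    return "major" if verbal_group else "minor"


def finite(verbal_group):
    """Return the word realising Finite, or None if there is none."""
    if not verbal_group:
        return None
    word, tag = verbal_group[0]
    return None if tag in NON_FINITE_TAGS else word


def predicator(verbal_group):
    """Return the word realising Predicator, or None in a minor clause."""
    if not verbal_group:
        return None
    return verbal_group[-1][0]


def status(verbal_group, was_sbar=False, rank_shifted=False):
    """'none' for minor clauses, otherwise 'bound' or 'free'."""
    if clause_class(verbal_group) == "minor":
        return "none"
    if finite(verbal_group) is None or was_sbar or rank_shifted:
        return "bound"
    return "free"


examples = {
    "has been sold": [("has", "VBZ"), ("been", "VBN"), ("sold", "VBN")],
    "slept": [("slept", "VBD")],
    "to slide": [("to", "TO"), ("slide", "VB")],
    "Not this year.": [],
}
for text, vg in examples.items():
    print(text, clause_class(vg), finite(vg), predicator(vg), status(vg))
```

Note that status calls finite and clause_class, which is one small instance of the order dependence between systems described above.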
Mood type

Mood type is an interpersonal system with the possible values ‘declarative’, ‘interrogative’ and ‘imperative’. Mood type refers to whether a clause is congruently a question (interrogative), command (imperative) or statement (declarative).

Minor and bound clauses do not select from this system, and therefore receive the value ‘none’. Free clauses with no subject are marked ‘imperative’. Clauses with the node labels SQ or SBARQ are marked ‘interrogative’. Other free clauses are marked ‘declarative’.

Tense

Tense is an interpersonal system whose value is some sequence of ‘present’, ‘past’, ‘future’, ‘modal’. Tense refers to the temporal positioning of the process of a clause, with respect to the time of speaking. In English, it is a serial value, because sequences of tenses can be built (‘have (present) been (past) going (present)’).

Finite declarative and interrogative clauses receive one or more tense values. The function iterates through the words of the verbal group (or the first verbal group in a verbal group complex), and assigns these values based on the words’ POS tags, and in special cases their text.

If a tag is either VBD or VBN, the value ‘past’ is appended to the tense list. If the tag is either VB, VBG, VBZ or VBP, the value ‘present’ is appended to the tense list. If the tag is MD, then the text is checked. If the word is “’ll”, ‘will’ or ‘shall’, the value ‘future’ is appended to the tense list. The value ‘modal’ is appended to the tense list for other lexical items tagged MD. When an MD tag is seen, the next word in the list is skipped, since it will be a bare infinitive that does not represent a tense selection. If the lexical items ‘going’ or ‘about’ are seen, the value ‘future’ is appended to the tense list, and the next two words are skipped, as they will be ‘to’ and an infinitive verb. This does not occur if ‘going’ is the last word of the verbal group, since in that case it is the process, not a tense marker.

Passive clauses will have received an extra ‘past’ tense value, so when a clause is labelled passive, its last tense selection is removed.

Polarity

Polarity is an interpersonal system with the possible values ‘positive’ and ‘negative’. Polarity refers to whether the verbal group is directly negated.

Polarity is the simplest system to determine, since it only involves checking the verbal group for the word “not” (or “n’t”). Looking at negation more generally would be far more difficult, since it is more of a semantic motif than anything specifically grammatical.

Adjuncts, Complements, Vocatives

Adjunct, Complement and Vocative are interpersonal roles. Nominal groups can be either Vocatives, Adjuncts or Complements. Adjuncts represent circumstances of a clause: the where, why and when of its happening. Complements represent its non-Subject participants: the whom, to whom and for whom of its happening. Vocatives are nominal groups that name the person the clause is addressed to.

Adverbial groups, prepositional phrases and particles are always given the interpersonal function ‘Adjunct’. Vocatives are explicitly marked in the Treebank, with the VOC tag. Nominal groups that realise an adverbial function are also explicitly tagged, with either TMP, DIR, LOC, MNR or PRP. Nominal groups with one of these tags receive the interpersonal role ‘Adjunct’. All other non-Subject nominal groups receive the interpersonal role ‘Complement’.

Voice

Voice is a textual system with the possible values ‘active’, ‘passive’ and ‘middle’. Voice refers to whether the Subject is also the ‘doer’ of the clause, or whether the participants have been switched so that the Subject is the ‘done to’. Compare the active clause “the dog bit the boy” with the passive version “the boy was bitten by the dog”. If clauses do not have a ‘done to’ constituent which might have been made Subject (i.e. a Complement), they are considered ‘middle’ (‘the boy slept’).

Minor clauses do not select for voice, and therefore receive the value ‘none’. Non-finite clauses are typed according to the POS tag of their Predicator. If the tag is VBG, voice is determined to be active; if the tag is VBN, voice is determined to be passive. Infinitive non-finite clauses receive the value ‘none’.

Finite clauses with a final tense other than ‘past’ are labelled active. If the final tense is ‘past’, and the penultimate word of the verbal group is a form of the verb ‘be’, the clause is labelled passive, and the tense sequence is corrected accordingly.

Active clauses are then subtyped into true active and middle voices. Middle clauses are active clauses which have no Complement.
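The tense and voice procedures can be sketched in a few lines of Python. This is an illustrative reading of the prose, not the conversion script. It assumes a finite verbal group given as (word, POS tag) pairs, treats a clause whose final tense is ‘past’ but which fails the ‘be’ test as active, takes a clause with no Complement to be middle (as in ‘the boy slept’), and omits the non-finite and minor cases. The names tense_sequence, voice, BE_FORMS and FUTURE_MODALS are hypothetical.

```python
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}
FUTURE_MODALS = {"'ll", "will", "shall"}


def tense_sequence(verbal_group):
    """Map a finite verbal group, as (word, POS) pairs, to a tense list."""
    tenses = []
    i = 0
    while i < len(verbal_group):
        word, tag = verbal_group[i]
        word = word.lower()
        is_last = i == len(verbal_group) - 1
        if tag == "MD":
            tenses.append("future" if word in FUTURE_MODALS else "modal")
            i += 2  # skip the bare infinitive after the modal
        elif word in ("going", "about") and not is_last:
            tenses.append("future")
            i += 3  # skip 'to' and the infinitive verb
        elif tag in ("VBD", "VBN"):
            tenses.append("past")
            i += 1
        elif tag in ("VB", "VBG", "VBZ", "VBP"):
            tenses.append("present")
            i += 1
        else:
            i += 1  # anything else contributes no tense
    return tenses


def voice(verbal_group, has_complement):
    """Return (voice, corrected tense list) for a finite clause."""
    tenses = tense_sequence(verbal_group)
    passive = (
        len(verbal_group) >= 2
        and tenses[-1:] == ["past"]
        and verbal_group[-2][0].lower() in BE_FORMS
    )
    if passive:
        return "passive", tenses[:-1]  # drop the extra 'past'
    return ("active" if has_complement else "middle"), tenses


print(tense_sequence([("have", "VBP"), ("been", "VBN"), ("going", "VBG")]))
print(voice([("was", "VBD"), ("bitten", "VBN")], has_complement=False))
print(voice([("bit", "VBD")], has_complement=True))
print(voice([("slept", "VBD")], has_complement=False))
```

The first call reproduces the ‘present, past, present’ sequence given for ‘have been going’. The passive example shows why the correction step is needed: ‘was bitten’ first yields two ‘past’ values, one of which is contributed by the passive participle rather than by a tense selection.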
Theme

Theme and Rheme are textual roles. Theme refers to the order of information in a clause. The Theme/Rheme structure of a clause is often called Topic/Comment in other theories of grammar. The Theme is the departure point of information in a clause. The Rheme is the information not encompassed by the Theme.

The first Adjunct, Complement, Subject or Predicator that occurs is marked ‘Topical Theme’. Any conjunctions that occur before it are marked ‘Textual Theme’, while any vocatives or finites that occur before it are marked ‘Interpersonal Theme’. All other clause constituents are marked ‘Rheme’.

Evaluation

Accuracy was checked using 100 clauses that had not been sampled while the script was being developed or debugged. Each clause was checked for constituency accuracy to the group and phrase rank, i.e., clause division and clause constituency were checked. Each of the eleven function structures was also checked: clause class, status, mood, tense, polarity, Subject, Finite, voice, Topical Theme, Textual Themes, Interpersonal Themes.

Two errors were found, both on the same clause. The status selection of an indirect projected speech clause was marked ‘free’ instead of ‘bound’. This occurred because the projected clause was topicalised (i.e., it occurred before the projecting clause), which is rare for indirect speech. To correct this, the script must consider the presence or absence of quotation marks, which may be complicated by the slightly inconsistent attachment of punctuation in the Penn Treebank (Bies, 1995). Because the status of this clause was given as free, the clause incorrectly met the entry condition for the mood type system, causing the second error: a mood type selection of ‘declarative’ instead of ‘none’.

In this somewhat small sample, 1198/1200 (99.83%) properties were correct (100 clauses, each checked for constituency and for the eleven function structures), and 99% of clauses were annotated without any errors. The lack of plausible Adjunct subtyping may present problems for the accurate determination of Topical Theme in a more register varied sample.

Adjuncts should be subtyped into Modal Adjuncts (such as ‘possibly’), Comment Adjuncts (such as ‘unfortunately’), Conjunctive Adjuncts (such as ‘however’) and Experiential Adjuncts (such as ‘quickly’). Only Experiential Adjuncts can be Topical Theme; if another kind of Adjunct occurs first it should be marked Interpersonal Theme (in the case of Modal and Comment Adjuncts), or Textual Theme (in the case of Conjunctive Adjuncts). The Wall Street Journal corpus, which was the only section of the Penn Treebank available for this research, contains very few Mood, Comment or Conjunctive Adjuncts, so the extent of this problem could not be properly measured.

Conclusion

This work is approximately ten years overdue, in the sense that that is how long the resources required to perform it have existed. The motivations for it are even older: corpus linguistics has been a pillar of systemic functional linguistic research since it began, and raw text corpora are inadequate for many of the questions systemic functional linguistics asks (Honnibal, 2004). The first effort to convert the Penn Treebank to another representation was presented within months of the corpus’s completion (Wang et al., 1994). Since then, treebanks have been converted to several grammatical theories (cf. Lin, 1998; Frank et al., 2003; Watkinson and Manandhar, 2001). It is unclear why SFG has been left behind for so long.

A corpus of over two million words of SFG constituency analysed text, annotated with the most important clause rank interpersonal and textual systems and functions, is now available. This is an important resource for linguistic research, the development of SFG parsers, and research into applying systemic linguistics to language technology problems.

Acknowledgements

I would like to thank Jon Patrick for his useful feedback on this paper. I also owe thanks to the many people who have helped me on my honours thesis, from which this paper is mostly drawn. Christian Matthiessen and Canzhong Wu have both been wonderful supervisors. Paul Nugent helped with the graphics used in this paper and my thesis, and proof-read with invaluable diligence. James Salter has shown remarkable patience over the last year and a half while teaching me to program. Finally, the language technology research group at Sydney Uni have all contributed sound advice and interesting discussions on my work.

References

A. Bies. 1995. Bracketing guidelines for Treebank II style. Penn Treebank Project.
Maria Herke Couchman and Casey Whitelaw. 2003. Identifying interpersonal distance using systemic features. In Proceedings of the first Australasian Language Technology Workshop.
Anette Frank, Louisa Sadler, Josef van Genabith, and Andy Way. 2003. From Treebank Resources To LFG F-Structures - Automatic F-Structure Annotation of Treebank Trees and CFGs extracted from Treebanks. Kluwer, Dordrecht.
Michael A. K. Halliday. 1966. The concept of rank: a reply. Journal of Linguistics, 2(1):110–118.
Michael Halliday. 1969. Options and functions in the English clause. Brno Studies in English.
Michael Halliday. 1976. System and Function in Language. Oxford University Press, Oxford.
Michael Halliday. 1994. Introduction to Functional Grammar, 2nd ed. Arnold, London.
Ruqaiya Hasan. 1987. The grammarian’s dream: lexis as most delicate grammar. In Halliday and Fawcett, editors, New developments in systemic linguistics: theory and description. Pinter, London.
Matthew Honnibal. 2004. Design, creation and use of a systemic functional grammar annotated corpus. Macquarie University.
Dekang Lin. 1998. A dependency-based method for evaluating broad-coverage parsers. Natural Language Engineering, 4(2):97–114.
M. Marcus, G. Kim, M. Marcinkiewicz, R. MacIntyre, A. Bies, M. Ferguson, K. Katz, and B. Schasberger. 1994. The Penn Treebank: Annotating predicate argument structure. In Proceedings of the 1994 Human Language Technology Workshop.
Christian M. I. M. Matthiessen and John A. Bateman. 1991. Text generation and systemic-functional linguistics: experiences from English and Japanese. Frances Pinter Publishers and St. Martin’s Press, London and New York.
Christian Matthiessen. 1995. Lexicogrammatical Cartography. International Language Sciences Publishers, Tokyo, Taipei and Dallas.
Robert Munro. 2003. Towards the computational inference and application of a functional grammar. Sydney University.
Jong-Nae Wang, Jing-Shin Chang, and Keh-Yih Su. 1994. An automatic treebank conversion algorithm for corpus sharing. In Meeting of the Association for Computational Linguistics.
S. Watkinson and S. Manandhar. 2001. In Proceedings of the workshop on evaluation methodologies for language and dialogue systems.