Reading visual and multimodal texts: how is ‘reading’
Senior Lecturer in Literacy Education
Australian Catholic University
This paper examines the differences between reading print-based texts
and multimodal texts within the context of changed literacy practices.
The author closely analyses aspects of a novel, a picture book and an
internet site to determine the similarities and differences in the way
readers would process each text. The ‘affordances’ of modes are
considered in relation to a text’s purpose and meaning-making
In the realm of literacy education there is much discussion of the textual shift, and thus
‘paradigm shift’ (Bearne, 2003) that has occurred for today’s students whose
environment is filled with visual, electronic and digital texts, those texts that are
referred to as ‘multimodal’ (Kress & van Leeuwen, 2001; Kress et al., 2001; Kress,
2003; Unsworth, 2001, 2002, 2003). Several researchers, nationally and internationally
(Arizpe & Styles, 2003; Callow & Zammitt, 2002; Jewitt, 2002; Lankshear, Snyder &
Green, 2000; Lankshear & Noble, 2003; Lemke, 2002; Gee, 2003), are investigating
what new theories of literacy and new pedagogies are needed to respond to a changed
learning environment. This paper considers the reading process within this changed
context. Although multimodal texts and print-based texts are not mutually exclusive, I
will examine some of the differences that occur in the reading of multimodal texts
compared with the reading of print-based texts.
Two theoretical perspectives are brought together with the purpose of considering a
conceptual framework for the reading of multimodal texts. The first perspective is
based on established theories of reading education that have been traditionally applied
to print-based texts and mostly monomodal texts. The second perspective draws on
recent innovative research and conceptualisation by others regarding the reading of
images and multimodal texts.
Multimodal texts are those texts that have more than one ‘mode’ so that meaning is
communicated through a synchronisation of modes. That is, they may incorporate
spoken or written language, still or moving images, they may be produced on paper or
electronic screen and may incorporate sound. Different types of multimodal texts that
students commonly encounter in their educational environment in print form are picture
books, information books, newspapers and magazines. Multimodal texts in non-print
form are film, video and, increasingly, those texts through the electronic screen such as
email, the internet and digital media such as CD Roms or DVDs.
The ‘reading process’ with print-based texts
Research over several decades has established that reading incorporates socio-cultural
and contextual dimensions together with cognitive, affective and visual processes. Luke
and Freebody’s reading practices model (1999) and Durrant and Green’s three
dimensional model (2000) are both a culmination and incorporation of traditional and
newer theories of reading. Reading involves different levels of decoding, responding
and comprehending at affective and cognitive levels, critiquing and analysing. Reading
is not static; it is a constant interaction between reader and text. This interaction
between reader and text can occur within a number of contexts simultaneously: the
social or cultural context of the individual reader, the socio-cultural context of the text
production, the genre and purpose of the text, the interest and purpose of the reader and
the immediate situation in which the text is being read at any particular moment. The
relationship between the reader and the text within the whole reading process is a two-
way recursive and dynamic interaction that occurs within both an immediate and wider
Interaction between reader and text does not occur without what is traditionally referred
to as decoding. Decoding involves using strategies of word recognition, pronunciation,
vocabulary knowledge, and the recognition of graphic, morphemic and phonemic
patterns. For the proficient reader these happen unconsciously. Levels of meaning,
depending on the type of text, can be enhanced by the reader’s background knowledge
of the world, of how language works and of how texts work as well as the recognition
of discourses and ideologies. There are different aspects of previous knowledge that a
reader may ‘cue’ into in the act of reading and these may be cultural knowledge,
general knowledge, specific content knowledge, or linguistic knowledge. Both
intertextuality and intratextuality are important aspects in the process and in the way a
reader ‘fills in gaps’. These gaps are those aspects that a reader needs to visualise, infer,
predict, conceptualise and imagine as the words of a text will never be able to ‘tell’
everything. Critical reading is an important part of the reader identifying different
discourses and understanding what ideologies are presented.
Reading in a multimodal environment
Is the reading of multimodal texts a different process from the reading of print-based
texts? A reader of a picture book or an information book needs to simultaneously
process the message in the words, pictures, images and graphics. With an electronic or
digital screen there will be added combinations of movement and sound. Kress and van
Leeuwen (1996, 2001) have challenged the notions of traditional literacy’s emphasis on
print in the light of the growing dominance of multimodal texts and digital technology.
They contend that a language based pedagogy is no longer sufficient for literacy
practices that are needed in our information age. Crucial issues being raised by Kress
and others (e.g. Heath, 2000; Bearne, 2003) are that ‘the screen’ and multimodal texts
are developing new ways of communication. Written text is only one part of the
message and no longer the dominant part. Heath (2000) has argued that visual texts are
impacting on ‘neural networks’ and changing conceptual schemata. New types of texts
require different conceptualisations and a different way of thinking. Kress (1997,
2003) describes significant differences between the words and images. He shows that,
with writing, words rely on the ‘logic of speech’ involving time and sequence, whereas
the ‘logic of the image’ involves the presentation of space and simultaneity. Thus the
reading of visuals involves quite a different process from the reading of words. Kress
and Bearne (2001) have shown that schools foster the ‘logic of writing’ whereas
contemporary children’s life experiences are grounded in the ‘logic of the image’ and
the ‘logic of the screen’.
Reading of print-based texts compared with reading multimodal texts:
similarities and differences
Keeping these research developments in mind, I will examine the process of ‘reading’
more closely by comparing the reading of print-based texts with multimodal texts. In
the following discussion I demonstrate similarities and differences that may occur in
the reading of three texts that use the subject of a ‘wolf’. The first two are literature
texts, a novel and a picture book, and the third is an information text on a web site. All
would be suitable for students in an upper primary or a junior secondary class.
Reading words: reading a novel
Figure 1 presents an extract from the beginning of a children’s novel, Milo’s Wolves.
The unknown brother
I wouldn’t like you to get the wrong idea. Milo’s wolves don’t have tails and
fangs, and they don’t go in for moonlight howling. Milo’s wolves are more like the nine
lives a cat’s supposed to have (though ours only had one).
Milo McCool is my father, and Mary is my mother. It was Mary who told us
about the wolves. Apparently, there was this very famous athlete in Ancient Greece
called Milo. When he was getting on a bit he tried to tear an oak tree apart and got his
hand stuck in it. The wolves ate him up before he could free himself. (What I’d like to
know is, where were all his friends when he needed them?)
Figure 1. From Milo’s Wolves (Nimmo, 2001, p.3)
What does a reader need to do to read this text with meaning? First of all the reader’s
curiosity would be aroused by both the unusual title of the novel itself and the title of
the first chapter. An important aspect of reading is prediction and both these headings
invite the reader to predict, guess, imagine or question aspects such as who or what is
Milo? What sort of wolves are these? What have the wolves and Milo got to do with an
‘unknown brother’? Immediately the reader is cueing into the context of this literary
genre of narrative with a sense of mystery and adventure, possibly drawing on any
knowledge about wolves or stories about ‘unknown’ characters.
The reader has to instantly cue in to the fact that the first person narrator is used – a
common technique for children’s novels – and to realise that they are being given a
filtered point of view through this focalising character. There is a very strong
interpersonal exchange between narrator and reader through the use of the personal
pronouns, “I”, “you”, “us”. Experience of other novels that use this technique would
help the reader. There is an accompanying colloquial style with this form of narrative,
more a spoken than a written mode. Familiarity with how this language is used in
literature is needed. Several intertextual references are made, such as the reference to
Greek mythology, to the habits of wolves and to the folk idiom of a cat having ‘nine
lives’. The reader has to understand the way wolves are referred to metaphorically, the
way this metaphor will be sustained throughout the narrative and to detect the
humorous undertone of the asides in parentheses.
The orientation of the novel suggests an aura of mystery with the use of the one word
‘unknown’ in the heading for the chapter ‘the unknown brother’. The reader is
immediately brought into the context of a family situation with the narrator telling us
the names and some details about her/his parents. The tone and asides suggest that this
will be a humorous novel. The fact that the narrator calls the parents by their first
names is unusual giving the reader a sense that she/he is more ‘equal’ to them, even in
some ways ‘superior’ to them. The names and the personal pronouns convey
particularly the interpersonal meaning and establish a relationship between child reader
and child narrator.
For a proficient reader, all these understandings – particularly previous knowledge of
narratives – would interact simultaneously with intertextual links, an important aspect
of reading literary texts. This process is an excellent example of how much of reading
does not rely on the decoding of print but occurs ‘before we even look at the page’
(Wallace, 1990). For full appreciation of the meaning the reader has to be ‘reading’ at a
number of different levels – interpersonal, symbolic, social. These are all ‘inside the
head’. All this meaning is conveyed through the choice of words and the way they are
arranged, in other words, the grammar of the text.
Reading words and images: reading a picture book
How does ‘reading’ occur when images are part of the text? Is the reading process as
described for the novel, a print-based text, applicable to the reading of images in a
picture book? As a comparison, I will examine the cover of the narrative picture book
The Wolf (Barbalet and Tanner, 1991) which also uses a ‘wolf’ metaphor within the
The cover, reproduced in Figure 2, shows the words of the title, The Wolf, at the top
within a framed illustration of three children sitting around a table. Two of the children
are concentrating on stacking playing cards into a pyramid formation. Another child, a
girl, is looking out towards either the reader/viewer or something else. In the
background is a dark sky with light reflected from the moon onto some clouds. Some
light is reflected on the children’s faces, more so on that of the girl on the right. The
frame is surrounded by a blue that tones in with other hues of blue and blue-grey in the
clothes, the sky and the children’s eyes. The name of the author and illustrator at the
bottom are outside the frame, separated from the title and its picture.
Figure 2. Front cover from The Wolf (1991). Written by Margaret Barbalet, illustrated
by Jane Tanner. (Reproduced with permission from Penguin Books Australia Ltd.)
There are a number of things that a reader needs to know in order to be able to ‘read’
the cover and to begin to predict how a narrative picture book with this title will
develop in plot. While there are only the two words of the title to introduce this story
the picture itself conveys meaning. Both contextual knowledge and background
knowledge are needed for the reader to infer that there is something mysterious and
frightening suggested by the title ‘The Wolf’ with no wolf, but three children, shown in
the picture. This absence of the wolf, the darkness and the full moon give ‘the wolf’ a
more sinister connotation than if a wolf was included in the scene or if the text was an
information book about wolves.
An experienced reader will instantly know this book will be a narrative. Images of
wolves are highly significant in literature of western society and this cultural
knowledge may be invoked. A reader may feel the unease created by the image – the
children seem as if they are waiting, trying to distract themselves with the cards. What
are they waiting for? What has the wolf to do with them? They could be inside but at
the same time seem to merge towards the evening landscape of clouds, sky and moon.
The absence of a barrier between a safe world inside and the world outside reinforces
the sense of threat. Where is the wolf? What is suggested by the image of the playing
cards? A reader who is familiar with the meanings constructed by particular narratives
would glance at this cover and understand many of these aspects instantly and respond
imaginatively or affectively.
These types of responses would occur unconsciously in the same way that a fluent
reader makes meaning from a written text, yet the responses are evoked by the effect of
visual codes such as colour, framing, line, angle, perspective and vectors, in other
words the ‘visual grammar’ (Kress & van Leeuwen, 1996; Simpson, 2004; Unsworth,
2001). In interpreting meanings from images we don’t need to ‘decode the words’ as
with print but we do need to be able to ‘break the visual codes’ in a different way. This
involves a different type of interpreting of a different coding system. We need to be
able to identify where the image-maker is using colour, position, angle, shape and so on
to construct a meaning. There are other effects of images that are different from words,
particularly at the affective, aesthetic and imaginative levels.
For example, with the cover of The Wolf the different hues of blue, grey and brown
suggest a sombre, suspenseful mood that is reinforced by the contrasting effects of dark
and light. Intensity is created by the facial expression of each child who is either staring
at the cards or out at us. The angle is medium distance which draws the reader towards
involvement with the characters especially with the girl who is looking out at us and
‘demands’ us to be drawn into this situation, to be curious and uneasy.
With the picture book, interaction between reader and text is different because of the
use of images and how images interact with words. The image is different from the
words that we read sequentially and syntactically. The image is there at once and fills
the page. Do the reader’s understandings and responses all happen holistically and
simultaneously? How do we know what part of the picture the reader’s eyes go to first
and in what order? What is the reading path? Do our eyes go immediately to the girl on
the right hand side of the image because of the diagonal vector of light between the
moon and her face?
For both the reading of words and the reading of pictures for these two texts, Milo’s
Wolves and The Wolf, the similar processes that would occur would be prediction, the
activation of schema or repertoire and cueing in to various contexts. For each text the
reader would be questioning and imagining a plot while drawing on background
knowledge of the world, and knowledge about narrative genre whether it is presented
through words or pictures. A reader would also be responding to interpersonal
meanings in each text. The purpose of each text is similar because they are both literary
narratives and written to engage a reader in the story at a number of different levels.
The wolf metaphor is used in the novel for humour and irony while in the picture book
for representing fear.
Reading words and images: reading an electronic text
Each of the above literature texts, though very different, uses the ‘wolf’ as a metaphor
for their narrative. Use of metaphor and multiple layers of meaning is itself a
characteristic of narratives whether in the form of novels, picture books or film.
Meaning is developed differently in information texts, whether in print or electronic
form. To stay with the theme of ‘wolves’ I will now discuss a website on the internet
that is an information text. It is the website for The International Wolf Center as shown
in Figure 3.
Figure 3. International Wolf Center homepage: http://www.wolf.org/wolves/index.asp
Consider what a reader has to do to gain meaning as she/he enters this site. With this
text the technological differences [i.e. screen, windows, frames, links, navigation bars,
menu buttons, use of cursor, mouse] are designed to assist the reader’s learning, to
attract and to maintain interest. A reader entering this site will do so to obtain
information about wolves, will have expectations firstly about wolves – depending on
the knowledge she/he already has – and secondly about what information the site will
provide. The focus of this site claims to be about learning as it states its aim to ‘teach
the world about wolves’. The reader can find all kinds of information about wolves
with a variety of linked activities. The background the reader brings to the website text
needs to link with the geographical and scientific contextual information and the
understanding of the information that, again, occurs ‘inside the head’.
This is a totally different text from the literature texts and has a different purpose as it
is an information genre. It does not work in symbols or metaphors but in providing
factual information in words, graphics and images. Its purpose is to give the reader who
enters this text a variety of information about different types of wolves, their habitat
and characteristics. At the same time the site is using many strategies to obtain financial
commitment from the reader and the reader needs critical discernment to recognise the
strategies of persuasion. The layout of the home page consists of several framed
sections, with links to other pages, that are each designed to engage us to not just learn
about wolves but to become involved in the “wolf center” in some way. Therefore the
home page is communicating to us as if we are members of a club with the ‘Welcome’
section being the largest framed part, and with the other framed sections having
headings such as ‘In the News’, ‘This month’s special’, and ‘Hot News’. We can choose
to become involved with the staff to learn more about wolf pups, so we can ‘hike, howl,
canoe and join the wolf pup staff…’ on excursions, for a fee of course. The language of
the menu buttons invites us to be active participants, e.g. ‘learn’, ‘experience’, ‘support’,
‘visit’, including the commercial ‘shop’ section.
A reader can choose different pathways depending on their interest. For example if they
choose the ‘Experience’ menu button there are offers for outdoor adventure programs
with colourful photos of people trekking in the snow at
<http://www.wolf.org/wolves/experience/experience.asp>. The reader is invited to
‘Meet our Wolves’ so can link to this page at
<http://www.wolf.org/wolves/experience/meet/meet_main.asp>. On this page there are
photos of individual wolves with their names under the photo. A reader can click on the
photo to go into a digital ‘photo album’ with separate windows opening to show other
pictures of that particular wolf with some written information. Alternatively you can
click on to the name of the wolf for more detailed information on another screen.
Photographs are most important to represent ‘real’ wolves in their real-life settings. On
the home page, the most salient image is the photograph of the white-gloved hands
holding the wolf puppy, signifying the commitment of this centre to care for wolves.
Authenticity is important to this centre’s communication so one section of the site uses
a digital camera or ‘wolf cam’ to provide up-to-date photographs of wolves in their
geographical setting to within ‘60 seconds’ of the viewer clicking on to the site. Here is
an attempt at creating virtual reality at
In the ‘Learn’ section at <http://www.wolf.org/wolves.learn/learn.asp> there are
options for either students or teachers to choose different pathways for sites that may
meet their needs.
Unlike a continuous narrative, every page of this website is fragmented into framed
sections so that information is segmented. There is no beginning or end and the reader
chooses their own pathway by using the menu buttons along the top of the screen or
clicks on to hyperlinks within frames. There is no need to go to every page or every
link. What sort of reading is happening here then? What will a reader do first – read the
words or the images or use the cursor to move around the screen and to click on to
different links without necessarily reading every word?
The similarities in the reading of these three different texts occur in the meaning-
making and interpreting process. Whether ‘reading’ words or images, or both, in a
novel, non-fiction text, a media text, a picture book, an information text or an
electronic screen we need to be able to understand the message to make meaning. We
need to understand the social purpose of the particular text and its cultural context and
this understanding will be linked to our own purpose in using the particular text.
Whatever the text we often need to ‘fill in the gaps’ to understand the cultural, social
and specific contexts. Any understanding is going to be tied to our previous experience
or knowledge in some way. The previous discussion of each of the three texts would
suggest that the reader’s schema is a major factor in the reading of both print-based and
multimodal texts. The way we interpret any new text, whether words or images, will
then produce new interpretations, new responses, and new meanings. We go through a
recursive, interactive process as we read words or look at images, negotiate electronic
screens and hyperlinks. We make links with our previous experiences of words, images,
screens and their content, then make new meaning. A new text will trigger these new
responses and interpretations. These are processes that go on ‘inside the head’ of the
reader. These ‘inside the head’ processes that contribute to meaning-making are
summarised in Table 1.
Table 1 Similarities in the reading of print-based texts and multimodal texts:
• Understanding of wider sociocultural context.
• Any text is part of a particular ‘genre’ (e.g. literary, information, media, internet,
• Reader adjusts expectations according to text type or purpose.
• Various schemata are activated – background knowledge, knowledge of topic, knowledge of
• There is an interaction between reader and text for meaning to be made. Meaning can be
made with ideational, interpersonal or textual metafunctions. The reader is ‘engaged’.
• Understanding and interpreting at cognitive & affective levels. [e.g. literal, inferential, critical
responses, empathising, analogising.]
• Understanding, analysing and critiquing ideologies, point of view, ‘positioning’.
• Imagination can be activated.
• Information can be obtained.
• There is a specific context, discourse and coherence.
• Skills specific to each type of text need to be activated by the ‘reader’/viewer [e.g.
aesthetic/efferent; predicting or scanning/skimming]
These processes are all part of meaning-making, the core of reading behaviour, as well
as all communication. The processing will occur depending on the type of text, its
purpose and the reader’s purpose. There are, however, many differences that occur with
different text genres as well as with the wide range of multimodal texts. If meaning-
making occurs as a basic process for reading all types of texts, the differences then
must be related to the way different modes contribute to the process.
Differences: processing modes
What are these differences? Clearly differences are dependent on the way modes are
processed and how particular modes activate a meaning-making process for the reader.
In multimodal texts, compared with print-based texts, the reader will use various senses
(sight, hearing, tactile, kinaesthetic) to respond to other modes. Table 2 summarises
some of the differences that may occur.