A NEW METHODOLOGY: DATA ELICITATION FOR SOCIAL AND
REGIONAL LANGUAGE VARIATION STUDIES
Carmen LLAMAS
Abstract
This paper presents a new method of data elicitation for use in large-scale regional language
variation studies, and for use in sociolinguistic studies of a given area. The methodology was
devised and designed to fit the requirements of a national collaborative venture, the Survey
of Regional English (SuRE).1 It was then expanded for use in a sociolinguistic study of
Teesside English currently being undertaken by the author.
1. Introduction
Many and varied methods of eliciting data for analysis of language variation exist and
research is continually being undertaken in the field. Recent and ongoing projects offer
detailed knowledge and insight into linguistic variation and change in Britain (e.g. the British
Dialect Grammar survey by Cheshire et al. (1989), in Milton Keynes by Kerswill and
Williams (1997), in Milton Keynes, Reading and Hull by Cheshire et al. (1999), in Tyneside
and Derby by Docherty et al. (1997)). However, researchers wishing to compare their
findings with those of another study are faced with individual projects which have different
aims and employ different methodologies. This makes direct comparisons of studies
potentially problematic (see Foulkes and Docherty (1999) for further discussion). Given the
possible process of dialect levelling (cf. Williams and Kerswill (1999), Watt and Milroy
(1999)) and the spread of current vernacular changes in certain phonological and
grammatical features in British English, the availability of studies which are regionally
disparate but directly comparable would be enormously advantageous.
Knowledge of current regional and social lexical variation in the British Isles is
extremely sparse, with few studies being or having been undertaken. The studies which have
been made generally utilise a similar method of data collection, namely the questionnaire.
However, these studies, again, are not necessarily comparable, since different notion words
have been used or sought.
No individual study of a given area has attempted to combine investigation of social
variation in spreading and localised features found in phonology, grammar and lexis. This
paper presents a new methodology designed to do just that. The core methodology has
been created for use in the proposed new survey of variation in the spoken English of the
British Isles, the Survey of Regional English (SuRE) (Kerswill, Llamas and Upton
(forthcoming) and Upton and Llamas (1999)). It can, however, be used in an individual
study of social variation of a given area, in its core form or in an expanded form, as
demonstrated by the methodology used in the Teesside study.
This paper begins with a brief description of the background to the development of
the new methodology, considering problems a multi-levelled data elicitation methodology
1 The project being embarked upon, the Survey of Regional English (SuRE), is a joint project with
funding being sought by a Leeds/Sheffield/Reading axis. Clive Upton, University of Leeds, Paul
Kerswill, University of Reading and John Widdowson, University of Sheffield are co-applicants. This
paper forms part of the methodology chapter from my forthcoming PhD thesis entitled Language
variation and innovation in Teesside English. Thanks go to Dominic Watt, Paul Foulkes and Clive
Upton for many helpful comments on drafts of this paper.
A new methodology: data elicitation for language variation studies
poses and the appropriateness, or otherwise, of established methods (section 2). The core
of the new methodology is then presented in section 3, and the additional elements for use in
the study of Teesside, in the north-east of England, are outlined in section 4. The current
paper deals only with the method of data elicitation to be used. All sampling decisions
regarding both the Teesside study and the SuRE project will be dealt with in future papers.
The Teesside study is acting as a pilot for SuRE; however, any suggestions for refinement of
the new method of data elicitation are also invited in response to the current paper.
2. Background
The larger picture of the concept of SuRE has necessarily dictated the design of the
new data elicitation methodology presented in this paper. Some understanding, therefore, of
the requirements of the methodology and the ideas behind the concept of SuRE is necessary
for an appreciation of the potential of the new methodology.
2.1 The proposed SuRE project: aims and difficulties
As the Survey of English Dialects (SED) (Orton and Dieth 1962-71), which was
carried out in the 1950s, represents the only consistently-collected nation-wide survey of
dialectal variation in England, a deficiency exists in the knowledge and awareness of current
variation on a national scale. The basic intention of the SuRE project is to create a
computer-held database of consistently-collected material from a planned network of British
localities which will record and document the facts of linguistic variation throughout Britain,
permitting detailed analyses of issues concerning the diffusion of language change and the
spread of current vernacular changes in British English. The form of the survey will be guided
by the necessity for the primary data to be the object of analytical work addressing current
research questions concerning levelling. At the same time, its form must be sufficiently broad
as not to preclude the potential for analysis which addresses other research questions arising
in the future.
In order for the SuRE project to obtain as complete a picture as possible of regional
language variation, data must be obtained which can be analysed on three levels of possible
variation: phonological, grammatical and lexical. To discount any of these levels would be to
obtain an incomplete picture of regional variation in spoken English found throughout Britain.
These multi-levelled data must be comparable across the localities to be studied, permitting
quantitative analyses of the different levels of regional and social variation where possible.
The primary aim of a methodology for the project would be to obtain samples of
informal speech from which analyses can be made at the phonological level and, to some
extent, the grammatical level. As this is a fundamental requirement of a methodology for the
project, a problem lies in combining the level of comparable lexical variation with the
necessity of obtaining natural speech, as to control the lexical items used in a conversation is
to make the interaction less than natural. This control can have the effect of formalising the
speech style, thus hindering the possibility of gaining access to the ‘vernacular’ or ‘the style
in which the minimum attention is given to the monitoring of speech’ (Labov 1972: 208). As
the vernacular ‘gives us the most systematic data for our analysis of linguistic structure’
(Labov 1972: 208), it can be regarded as the style required by the elicitation method of the
SuRE project.
2.2 Previous studies and their applicability
Llamas
As a means of eliciting data, the questionnaire has been employed in traditional
dialectology since the nineteenth century and was the ‘fundamental instrument’ of the SED
(Orton and Dieth 1962: 15). Although it proves successful in eliciting lexical and some
grammatical data, it would be entirely inappropriate for a current survey whose intention is
to access and collect samples of informal speech large enough to undertake phonological
analyses which permit quantification.
Additionally, the methods employed by the SED, and by other studies undertaken
within the traditional dialectological paradigm, give scant information on language variation
associated with social factors within a given area, this not being the focus of interest of such
research. Social variables, however, are central to current studies of variation. As such,
many more informants are required from each location than the two or three used in the
SED. Therefore, the methodology for the SuRE project must be relatively quick and easy to
administer, demanding the minimum of the informant’s and the fieldworker’s time, unlike the
lengthy SED questionnaire, which contained 9 books of questions, each one taking at least 2
hours to complete (Orton and Dieth 1962: 17). Thus, methods which are associated with
traditional dialectological studies of language variation are quite inappropriate to the
proposed SuRE project.
However, methods used to obtain data for research undertaken within a quantitative
paradigm are also inappropriate. Various attempts have been made to access the
vernacular, or the informant’s least overtly careful speech style: for example, the interview
situation in which the fieldworker asks questions to elicit personal narratives (cf. Labov
1972, Trudgill 1974), or the practice of allowing informants to converse in pairs on topics of their own
choosing with minimal fieldworker involvement (cf. Docherty et al. 1997). Although
successful in obtaining informal speech, these methods almost completely remove the
possibility of obtaining comparable information on lexical variation. The anthropological
technique of participant observation, as used by Cheshire (1982) in Reading and Milroy
(1987a) in Belfast, although successful in gaining quantitative and qualitative data, is also far
too time-consuming for a collaborative project. The wish to access the vernacular, as in
quantitative studies, and the wish to obtain stylistic variation in the speech sample, are central
to the aims of SuRE, however.
Thus, because the data must be elicited quickly and easily, and because lexical
variation must be included, which in turn eliminates the option of ‘free’ conversation, an
interview of some sort must be used in the SuRE methodology. However, an approach to the
elicitation of lexical data completely different from that of the traditional questionnaire is
necessary, as the interview must elicit data which are analysable phonologically and also
grammatically. As the data must be quantifiable (where possible), comparable and analysable on
3 levels of variation, and the method must be administrable to a relatively large number of informants, a
completely new method of data elicitation and collection is necessary, as no existing data
elicitation technique is entirely suitable or applicable to the needs of the proposed SuRE
project.
3. The new method: the SuRE core
3.1 Overall aims
The primary aims of the new methodology are to obtain informal speech from the
informant (from which multi-levelled analyses of both regionally and socially comparable
data are possible), and to elicit the data as quickly and easily as is possible. A methodology
which is perceived to be too complicated or lengthy to administer may result in the
unwillingness of potential fieldworkers to use it.
Although the interview as a speech event is not the ideal means through which to elicit
casual conversation due to the ‘asymmetrical distribution of power suggested by the roles of
questioner and respondent’ (Milroy 1987b: 49), it proves to be the only practical way of
obtaining the necessary data. It is vital therefore to lessen the formality of the interview
situation as much as possible, and to make the interview an unintimidating and, if possible,
enjoyable experience for the informant.
In order to obtain the required informal speech style combined with data on lexical
variation in the interview, the fieldworker ‘leads’ a conversation around semantic fields. To
lessen the formality of the interview context, the interview is undertaken with socially paired
informants, permitting interaction to be more like a conversation than an interview.
Discussion on local lexical items is prompted by the fieldworker, with informants encouraged
to discuss their ‘dialect’ words, how they are used and what connotations they have. As
well as producing informal conversation from which phonological and, to some extent,
grammatical analyses can be made, the ensuing conversation produces a mass of information
on the lexical data produced. This can include age and sex differences in usage,
connotational and collocational information, knowledge and use differentiation of given items
and attitudinal information on dialect.
Although the method of discussing lexical items in pairs produces the sample of
informal speech for analysis, control must still be exercised over the specific lexical items
elicited in order for direct comparisons of variants to be possible.
3.2 Sense Relation Network sheet
The principal tool devised and designed to allow the information on lexical items to be
comparable regionally and socially, and to give a somewhat flexible structure to the
interview, is the Sense Relation Network sheet (SRN). The three SRNs which form the
core of the interview are shown in Figures 1, 2 and 3.
3.2.1 SRNs: visual design and content design
Both the visual design and the content design of the SRNs are inspired by the idea
that there exists a ‘web of words’ (Aitchison 1997: 61), or a series of interconnected
networks which define, delimit and store linguistic expressions in the mind. The visual design
of the SRNs is also inspired by materials and aids used in language teaching, such as word
trees and word field diagrams (see Gairns and Redman 1986), in which visual impact is
crucial.
As can be seen in Figures 1, 2 and 3, visually, networks are designed in which the
standard notion words are connected to subdivisions. The subdivisions, in turn, are
connected to the semantic field of the SRN, symbolising, in a way, the interconnected
network or ‘web of words’. Space is then provided under the standard notion word for the
insertion of a dialectal partial synonym. Each SRN is printed in a different colour (presented
here in black and white), the aim being for the visual impact of the SRNs to be positive and
unthreatening, and for the SRNs to engage the interest of the informant to a level at which
the desire is to complete them.
In terms of content design, the SRNs are built around semantic fields (Lehrer 1974)
and, as such, are akin to the grouping of questions by subject matter in the SED
questionnaire. According to Johnston (1985: 83), the grouping of questions by subject
matter, as opposed to alphabetically or randomly, allows for a level of spontaneity in the
responses. On the SRNs, standard notion words are offered as prompts for the elicitation of
dialectal variants, as interviews which use indirect elicitation techniques are much more time-
consuming than those which use direct ones. Additionally, indirect questioning may make the
interaction feel more like an interview or a test than a conversation, so skewing speech style
towards the formal.
The selection of semantic fields and standard notion words in the 3 SRNs is the result
of trialling and revision of the method during which 8 original SRNs have been subsumed
under the present 3. The subsumption was made in the interests of reducing the time needed
by informants to complete the SRNs, as well as the time necessary to conduct the interview.
None of the initial semantic fields have been discarded entirely, but the fields have become
broader to encompass a greater area of notion words. Standard notion words producing
little or no variation in trialling have been removed. However, each sub-division carries
space for dialectal variants of notion words not included on the SRN which the informant
wishes to include. When selecting standard notion words, the wish to include the same
standard notion word as the SED where possible and appropriate was borne in mind, as a
direct comparison could reveal potential real time change. Due to the urban bias of the
proposed survey and of the study of Teesside English, however, this proved inappropriate in
most cases, with few SED notion words remaining.
The SRNs then, as well as being a visual network, rather than a list of questions,
represent the interrelated network of paradigmatic and syntagmatic sense relations in which
linguistic expressions from similar semantic fields define and delimit each other’s meanings.
They also represent the sense relation of partial synonymy, which the dialectal variant holds
with the standard notion word. Additionally, in time they will represent a geographical sense
relation network of dialectal variation of partial synonyms found throughout the British Isles.
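The layered structure of an SRN described above (semantic field, subdivisions, standard notion words, and space for dialectal variants) can be pictured as a small nested data structure. The following is only an illustrative sketch, not part of the SuRE materials: the class names, the field and subdivision labels, and the grouping of 'man' and 'attractive' under one subdivision are all assumptions made for the example; only the notion words and variants themselves are taken from responses reported in this paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NotionWord:
    """A standard notion word with space for dialectal partial synonyms."""
    standard: str
    variants: List[str] = field(default_factory=list)


@dataclass
class Subdivision:
    """A subdivision linking notion words to the semantic field of the SRN."""
    label: str  # hypothetical label; the real labels appear on the SRN sheets
    notion_words: List[NotionWord] = field(default_factory=list)
    # Space for variants of notion words that are not printed on the SRN.
    extra_variants: List[str] = field(default_factory=list)


@dataclass
class SenseRelationNetwork:
    """One SRN sheet: a semantic field connected to its subdivisions."""
    semantic_field: str  # hypothetical name
    subdivisions: List[Subdivision] = field(default_factory=list)

    def notion_words(self) -> List[NotionWord]:
        # Flatten the network into a single list of notion words.
        return [nw for sub in self.subdivisions for nw in sub.notion_words]

    def variant_count(self) -> int:
        # Count variants under printed notion words plus any extra variants.
        printed = sum(len(nw.variants) for nw in self.notion_words())
        extra = sum(len(sub.extra_variants) for sub in self.subdivisions)
        return printed + extra


# Placeholder field and subdivision names; the grouping is assumed.
srn = SenseRelationNetwork(
    semantic_field="people (placeholder)",
    subdivisions=[
        Subdivision(
            label="describing people (placeholder)",
            notion_words=[
                NotionWord("man", ["bloke"]),
                NotionWord("attractive", ["bonny", "canny-looking", "lush"]),
            ],
        )
    ],
)
print(len(srn.notion_words()), srn.variant_count())
```

Holding responses in a common structure of this kind is one possible way of keeping completed SRNs comparable across informants and localities, since every variant remains attached to the standard notion word that prompted it.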
3.2.2 SRNs: technique of administration
Coupled with their concept and design, the technique of administering the SRNs is an
essential part of their success as a method of eliciting lexical data. Informants are given the
SRNs some five days before the interview, with both verbal instructions from the
fieldworker and written instructions as part of the interview pack (see Appendix 2 for
instruction sheet2). The innovatory step of allowing informants to know the content of the
interview prior to the event has implications for both the content of the interview and for the
interview as a speech event.
Giving informants the SRNs prior to the interview allows them time to consider the
lexical items they use. This has a dramatic effect upon the amount of lexical data yielded
from the interview. If informants are asked to produce a dialectal variant as an immediate
response to a prompt, there is a danger of their minds going blank. This results in minimal data
being yielded and may also necessitate an undesirable level of prompting from the
fieldworker. More importantly, however, there could be a harmful effect on the required
speech style and the willingness of informants to speak at length, due to a feeling of unease in
the interview situation. Thus, the technique of administering the materials prior to the
interview maximises the amount of data yielded.
2 Note that the instruction sheet shown in Appendix 2 is part of an interview pack used by an informant
from the Teesside study. As such, this carries an additional instruction about the completion of the
Language Questionnaire, which is part of the extended methodology used in Teesside (see section 4.1).
This, and therefore the instruction, do not form part of the core methodology.
Any feelings of unease in the interview situation may be heightened if the informant
perceives the interview as a test of some sort. When informants have prior knowledge of the
content of the interview, however, it is thought that their suspicion is diminished
considerably. This, combined with the fact of experiencing the interview in a social dyad,
allows informants to settle into a relatively casual speech style in as short a time as possible.
To ensure the ready recruiting of informants and to maximise the possibility of gaining access
to their least overtly careful or monitored speech style, it is crucial that informants feel at
ease and enjoy the interview as much as possible.
When the informants have had some days in which to complete the SRNs at their
convenience, discussing responses with others should they wish (differentiating between their
own and others’ responses on the SRNs), the paired interview is undertaken and recorded
onto minidisc. The interview consists of the written responses on the SRNs being read out
by the informants with responses being discussed in terms of whether informants use the
variants or only know them, situations in which they would be used, connotations and
collocations associated with the variants, as well as anything else which informants might
initiate. The fieldworker can use an interviewer’s guide to ensure that all the notion words
are covered (the informants keep their own SRNs until the end of the interview). The
interviewer’s guide can also contain prompt questions, e.g. the use of intensifiers, gender
differences in use, age differences in use, varying degrees of a state, additional notion words
or senses of the notion words given, all of which can provide additional information and
extend the discussion. During the interview other known or used variants which come up are
noted on the SRNs in different coloured ink by the informant. Thus the written record of the
informant’s responses on the SRNs (which the fieldworker collects after the interview), a
recording of the informant’s spoken responses for pronunciation purposes and a mass of
attitudinal information on the lexical items elicited in an informal speech style are all secured
by means of the recorded interview.
3.2.3 SRNs: data yielded
In terms of lexical items elicited through the SRNs, the richness of the data yielded
can be seen in the 3 completed SRNs which appear as Appendix 3. The potential for the
study of the differences and problematic distinctions between dialectal variants, regional
slang, national slang and standard colloquialisms is clear. The study of nonstandard
orthography is also promoted by the method. Additionally, the difference between items
produced before and items produced during the interview may be of interest.
From the recorded discussion about the responses, more lexical data are produced.3
Informants can use dialectal variants without necessarily being aware they are doing so. For
example, one informant, when discussing the notion word ‘man’, claimed that she would
never use bloke after already having done so during the interview. Additionally, informants
may become aware that they themselves use a particular word only when they hear someone
else use it. Informants’ insights into which variants are considered to be local, as
opposed to those which are more widely used, can also be revealed. For example, one informant
claimed not to have inserted a variant for soft shoes worn by children for P.E. because
3 Although the 3 SRNs shown as Appendix 3 present 215 variants for 80 standard notion words, by
including all the variants the informant mentioned but did not write on the SRNs during the interview
and those she claimed knowledge of during the recorded interview, a total of 272 variants were counted
from this informant.
she ‘couldn’t think of another word for sandshoes’, indicating that she believed sandshoes
to be a widely used or standard variant.
Once read in isolation, lexical items are immediately put into context by the informant.
Thus, the individual lexical item is clearly recorded for transcription purposes and can then
be disregarded for the purposes of a phonological analysis of informal speech (the nature of
the written response on the SRN being read aloud possibly constituting a more formal
reading style of speech). It would, however, be possible and interesting to compare
phonological features of the more formal and less formal styles. The context of the
interaction makes it clear which particular lexical items are read aloud and which are not.
Alternatively, the use of different coloured ink on the SRNs is an indicator of which variants
were written before the interview (and thus read aloud), and which were noted down during
the interview (written after having been spoken). (The latter variants are indicated with an
asterisk in the reproduced SRNs of Appendix 3.)
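The distinction drawn above, between variants written before the interview (and so read aloud) and variants noted down during it (written after having been spoken), could be recorded at the transcription stage along the following lines. This is a minimal sketch under stated assumptions: the names `Stage`, `Response`, `read_aloud` and `as_printed` are hypothetical, and the assignment of the 'steal' variants to one stage or the other is invented purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Stage(Enum):
    BEFORE = "written before the interview, then read aloud"
    DURING = "noted during the interview, after being spoken"


@dataclass(frozen=True)
class Response:
    notion_word: str
    variant: str
    stage: Stage

    def as_printed(self) -> str:
        # Variants noted during the interview are marked with an asterisk.
        if self.stage is Stage.DURING:
            return self.variant + "*"
        return self.variant


def read_aloud(responses: List[Response]) -> List[Response]:
    """Return the items read from the sheet, i.e. the more formal reading style."""
    return [r for r in responses if r.stage is Stage.BEFORE]


# Illustrative only: which variants were written beforehand is assumed here.
responses = [
    Response("steal", "twoc", Stage.BEFORE),
    Response("steal", "nick", Stage.BEFORE),
    Response("steal", "skank", Stage.DURING),
]

print([r.variant for r in read_aloud(responses)])
print([r.as_printed() for r in responses])
```

Flagging each token in this way would allow the read-aloud items to be set aside for an analysis of informal speech, or kept for a comparison of the more formal and less formal styles.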
After having been read aloud, the lexical items are generally elaborated upon and
discussed in the context of casual conversation, giving the sample of informal speech which
can be analysed phonologically and grammatically. For example, after having given the
responses twoc, tax, nick, skank, and swipe for the notion word ‘steal’, two informants
went on to discuss at length precisely what each term referred to and their ideas on the
origins of the words. Similarly, sex and age differences in responses to notion words are
discussed at length, with, for example, two young male informants arguing that they would
never use the variant bonny for the notion word ‘attractive’, it being an ‘old person’s’ word,
and they would never use canny-looking, it being used by girls, opting themselves to use
nectar, sweet, fit and lush. Thus, the informal speech which can be analysed phonologically
and grammatically also contains a mass of data on: knowledge and use of lexical items;
attitudinal information on dialectal variants; ideas on word origins; changing societal attitudes
to lexical items; and perceived and actual sex and age variation in usage. In this way a
multi-levelled bank of data is produced through use of the SRNs.4
3.3 Identification Questionnaire
Combined with the 3 SRNs, an Identification Questionnaire (IdQ) is included in the
interview. The IdQ is given to the informants, with the 3 SRNs, prior to the interview, thus
forming the interview pack. The questions posed in the IdQ of the core interview are listed
below in Figure 4. The IdQ can be expanded for use in a given area, as in the Teesside
example (see Appendix 4).
4 From its initial conception to the stage at which it could be used in the
Teesside study, the method of data elicitation has been trialled relatively extensively. As well as being
trialled and revised by myself with 12 informants from Leeds, it has been tried by other researchers in an
external trialling stage of its refinement. It has also been used by students from the University of Leeds
and the University of Basel. Thanks go to Ann Williams, Jason Jones, Mark Jones, Louise Mullany and
Clive Upton for trying the method and giving extremely helpful comments on the effectiveness of the
technique as a method of data elicitation.