Grape (Vitis vinifera L.) BAC Library Construction,
Preliminary STC Analysis, and Identification of Clones
Associated with Flavonoid and Stilbene Biosynthesis
J.P. Tomkins,1* D.G. Peterson,2 T.J. Yang,1 D. Main,1 E.R. Ablett,3 R.J. Henry,3 L.S. Lee,3
T.A. Holton,3 D. Waters,3 and R.A. Wing1
We have constructed a grape BAC library using the cultivar Syrah. The library contains 55,296 clones stored
in 144 384-well microtiter plates. A random sampling of 381 BACs indicated an average insert size of 144 kb
with a range of 30 to 355 kb, and less than 4% of the clones lack inserts. Eighty-nine percent of
BAC clones in the library have an insert size greater than 100 kb. Based on a genome size of 483 Mb,
library coverage is 16.5 haploid genome equivalents. Screening the BAC library colony filters with cpDNA
sequences showed that contamination of the genomic library with chloroplast clones was low (1.5%). Library
screening of an 11X coverage (2 BAC colony filters) with 12 cDNA probes corresponding to flavonoid and
stilbene biosynthesis genes resulted in an average of 13 hits per probe (range = 1 to 27). To gain a glimpse
into the grape genome and evaluate the library for sequence-tagged connector (STC) development, 768 BAC
clones were end sequenced in both forward and reverse directions. The STCs were queried against the
SWISS-PROT database and significant hits were sorted according to putative function.
Key words: Grape, BAC library, BAC end sequencing, flavonoid and stilbene biosynthesis
1Clemson University Genomics Institute, Clemson, SC 29634; 2Department of Botany, University of Georgia, Athens, GA 30602; 3Centre for Plant Conservation Genetics, Southern Cross University, PO Box 157, Lismore NSW 2480, Australia.
*Corresponding author: [Email: email@example.com]
Acknowledgments: Appreciation for technical assistance during library replication and filter production is extended to David Frisch and Michael Atkins and to Scott Tingey and the DuPont Corporation for providing cDNAs to screen the BAC colony filters. This research was supported by funds from the Centre for Plant Conservation Genetics, Southern Cross University, PO Box 157, Lismore, NSW 2480, Australia. The library is the property of Southern Cross University, and users must complete a materials transfer agreement.
Manuscript submitted July 2001; revised October 2001
Copyright © 2001 by the American Society for Enology and Viticulture. All rights reserved.
Am. J. Enol. Vitic. 52:4 (2001)
Grape (Vitis vinifera L.) is one of the most important horticultural fruit crops in the world [13]. Rapid progress in cultivar improvement through traditional breeding methods is hampered by extensive labor, funding, and other aspects required for cultivation. Furthermore, our understanding of grapevine genetics is limited, primarily due to the lack of inbred lines required for extensive genetic study. Therefore, molecular breeding approaches provide additional tools in the development and subsequent selection of novel grape cultivars.
In recent years, the molecular mapping of the grape genome has been accomplished using dominant PCR-based markers and a pseudotestcross approach to linkage analysis [4,7]. These maps are able to provide the necessary guide for the physical mapping and cloning of genes associated with important horticultural traits.
The map-based cloning of important genes requires the development of large insert genomic libraries. The bacterial artificial chromosome (BAC) cloning system appears to offer advantages over other large insert cloning systems [11,18]. Therefore, the BAC system is very appealing as a vehicle for advanced genome analysis in grape. At the present time, there are no published reports of BAC libraries for grape.
In the current study, we report the development of a BAC library for the grape cultivar Syrah, an important red wine germplasm that contains genes of interest in regard to flavonoid and stilbene biosynthesis. The library was characterized with chloroplast DNA sequences and cDNAs associated with flavonoid and stilbene biosynthesis. A preliminary STC database for the grape genome based on the sequencing of 1,536 BAC ends was also developed. Bioinformatic analysis of the STCs provides new insights into grape genome structure.
Materials and Methods
BAC library construction. A new BAC vector, pCUGI-1, was used and prepared as described by Luo et al. [8]. Megabase plant DNA embedded in agarose plugs was obtained as described by Peterson et al. [9], using the method for plant tissue [...] digests of megabase DNA (using HindIII), size selections, and ligations were performed as described in detail by Peterson et al. [9]. Recombinant [...] (Genetix Corp., Queensway, UK) and stored individually in 144 384-well microtiter plates. Three copies of the library were made and stored in separate -80°C freezers.
BAC clone characterization. To prepare BAC DNA, 3 mL LB chloramphenicol (12.5 µg/mL) cultures were grown overnight in six-cell Autogen tubes and miniprepped robotically (Autogen 740 plasmid isolation system, Framingham, MA). To
estimate insert size and determine distribution of clone size, a total of 384 BAC preps were performed from clones selected at random throughout the library. Due to DNA sample loss, 381 preps were used for restriction analyses. The BAC DNA was digested with 7.5 units (10 hr at 37°C) of NotI and analyzed by pulsed field electrophoresis in 1% agarose gels (6 V/cm, 5 to 15 sec switch time, 15 hr run time, 14°C). Southern blots of size-separated BAC inserts were performed using standard protocols after UV nicking the gels (Gene Linker, Bio-Rad, Hercules, CA). Total genomic grape DNA for use as probe was extracted from Syrah plants using the DNAzol® ES extraction protocol for plants (Molecular Research Center, Cincinnati, OH) and 32P labeled using standard random priming techniques.
BAC library screening. High-density colony filters for hybridization-based screening of the library were prepared using the Genetix Q-bot. Clones were gridded in double spots using a 4x4 array on 22.5 cm square Hybond N+ filters (Amersham Corp., Buckinghamshire, UK). This gridding pattern allows 18,432 clones to be represented per filter. Colony filters were treated and hybridized using standard techniques. Radiolabeling (32P) of probe DNA and hybridization of colony filters were performed using standard techniques. Screening for chloroplast DNA in the library used three barley chloroplast clones containing ndhA (470 bp), rbcL (1,300 bp), and psbA (1,400 bp) sequences. These sequences are spaced equidistant around the 133 kb barley chloroplast genome. Chloroplast clones were obtained from J. DuBell (Dept. Biochemistry and Biophysics, Texas A&M University, College Station). Screening was also performed with 12 cDNA clones associated with flavonoid and stilbene biosynthesis as indicated in Table 1. A summary of this pathway and the various genes involved is presented by Sparvoli et al. [12]. The cDNA clones used for the BAC filter hybridizations were obtained from a grape EST project managed by the DuPont Corporation (Wilmington, DE). The putative function of each cDNA was determined based on sequence similarity.
BAC end sequencing. Preparation of BAC DNA for end sequencing was done in a 96-well format using standard alkaline lysis miniprep techniques. Sequencing reactions were set up according to the manufacturer's instructions for the Big Dye Terminator chemistry (Applied Biosystems, Foster City, CA). Reactions were performed using forward and reverse universal primers. Samples were loaded onto 48-lane sequencing gels in ABI377 automated sequencers. Gels (250 mL) were composed of the following: 5% Long Ranger (FMC), 6 M urea, 18 µL TEMED, 150 µL ammonium persulfate (10% stock), and 1x TBE buffer. Reaction products were electrophoresed using a 3.5 hr run. Base-calling was performed automatically using PHRED [5,6], and vector sequences were removed by CROSS-MATCH [...] sequences (defined as those having >100 nonvector bases with a PHRED quality value >20) were used as queries in FASTX searches of the SWISS-PROT database. All software was run locally on a Sun Ultra30 workstation using Solaris 2.6. The grape BAC end sequences have been submitted to GenBank [...]
BAC library construction and characterization. We have constructed a BAC library for the red winegrape cultivar Syrah that is suitable for physical mapping, DNA sequencing, and cloning genes associated with key horticultural traits. HindIII was used as the cloning enzyme. The library consists of 55,296 clones stored in 144 384-well microtiter plates. Less than 4% of the clones lack inserts, as judged by random analysis of BACs sampled from the library. A random sampling of 381 BACs taken from the library indicated an average insert size of 144 kb, with a range of 30 to 355 kb. Based on a haploid genome size of 483 Mb, the coverage of the library is approximately 16.5 haploid genome equivalents, [...] probability of recovering any specific sequence. Figure 1 shows 23 randomly selected clones digested with NotI to release the insert. The two NotI sites in pCUGI-1 flank the multicloning site. Because NotI is a GC-rich 8-base cutter and the grape genome is relatively AT rich, digestion typically generates a vector band plus
one insert band.
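The coverage figures quoted for the library can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the recovery probability uses the standard Clarke-Carbon approximation, P = 1 - (1 - I/G)^N, which the text does not name explicitly and which assumes clones are sampled at random from the genome.

```python
import math


def library_coverage(n_clones: int, mean_insert_kb: float, genome_mb: float) -> float:
    """Haploid genome equivalents represented by n_clones (hypothetical helper)."""
    return n_clones * mean_insert_kb / (genome_mb * 1000.0)


def recovery_probability(n_clones: int, mean_insert_kb: float, genome_mb: float) -> float:
    """Clarke-Carbon estimate P = 1 - (1 - I/G)^N, assuming random cloning."""
    fraction = mean_insert_kb / (genome_mb * 1000.0)
    # log1p/expm1 keep precision when the probability of a miss is tiny
    return -math.expm1(n_clones * math.log1p(-fraction))


# Values reported in the text: 55,296 clones, 144 kb mean insert, 483 Mb genome
full_library = library_coverage(55_296, 144, 483)
# Two colony filters at 18,432 clones each, as used for the cDNA screening
two_filters = library_coverage(2 * 18_432, 144, 483)
p_recover = recovery_probability(55_296, 144, 483)

print(f"Full library coverage: {full_library:.1f}x")
print(f"Two-filter coverage:   {two_filters:.1f}x")
print(f"Recovery probability:  {p_recover:.6f}")
```

Under these assumptions the two-filter coverage is also the number of positive clones expected for a single-copy gene. Because the inserts were generated with a restriction enzyme, cloning is not strictly random, so the probability should be read as an idealized upper estimate.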
Table 1 Grape BAC library hybridization results using 12 cDNA probes associated with anthocyanin biosynthesis. Two high-density BAC colony filter arrays were used for each probing, allowing the screening of 11 haploid genome equivalents.
CHS (Chalcone synthase)
StSy (Stilbene synthase)
DFR (Dihydroflavonol 4-reductase)
CHR (Chalcone reductase)
F3H (Flavanone 3-hydroxylase)
ANS (Anthocyanin synthase)
CHI (Chalcone isomerase)
F3'H (Flavonoid 3',5'-hydroxylase)
PAL (Phenylalanine ammonia lyase)
FLS (Flavonol synthase)
C4H (Cinnamate 4-hydroxylase)
F3'H (Flavonoid 3',5'-hydroxylase (3'))
To determine the size distribution of BAC clones in the library, the 381 BACs analyzed with NotI digests were grouped by insert size, and the insert size of each clone was plotted against the frequency of each group of clones represented in the library (Figure 2). Based on this analysis, 89% of the clones in the library have an insert size greater than 100 kb. Of the clones larger than 100 kb, 65% are greater than 125 kb.
To obtain an estimate of the representation of chloroplast DNA in the library, colony filters were screened with three different chloroplast genes. Results from this screening showed that approximately 1.5% of library sequences are chloroplast DNA (data not shown). The low chloroplast DNA content of the library is likely due to the use of nuclei as a megabase DNA source rather than protoplasts [14,15,19].
To test the library for coverage and isolate genomic regions associated with stilbene and flavonoid biosynthesis, screening of high-density BAC colony filters was performed using 12 grape
cDNAs identified as representing genes associated with the stil-
bene and flavonoid biosynthesis pathway (Table 1). The filter sets used for the cDNA probes consisted of two filters representing 36,864 clones or 11 haploid genome equivalents. An average of 13 positive signals was obtained for the 12 probes, with a range of 1 to 27. The wide range of positive signals identified between probes is likely indicative of the effects of preferential cloning obtained from the use of a restriction enzyme to generate the inserts. Nevertheless, an adequate number of positive clones was obtained for all probes, except for one cDNA that only produced one positive signal. The average number of hits was slightly above the expected estimate of 11 positive signals per probe, suggesting that some of the cDNAs may have targeted duplicated regions of the genome or members of gene families. The BAC clones identified from this effort will provide the full gene sequence including regulatory regions for many of the key proteins involved in flavonoid and stilbene biosynthesis.
Figure 1 Analysis of 23 randomly selected grape BAC clones. Ethidium bromide stained CHEF gel (5 to 15 sec switch time, 14 hr) showing insert DNA above the common 7.5 kb pCUGI-1 vector band. Molecular weight marker in outside lanes is a 50 kb lambda concatamer.
Figure 2 Insert size distribution of BAC clones in the grape Syrah BAC library. To estimate insert size range, BAC DNA from 381 randomly selected clones was analyzed, as shown in Figure 1. Results indicate that the average insert size is 144 kb, with over 89% of the clones > 100 kb. X-axis: insert size (kb), in bins from <50 to >250.
BAC end sequencing. To examine the feasibility of using an STC strategy to establish a framework for sequencing selected regions of the grape genome, we sequenced and analyzed the ends (forward and reverse) of the first 768 clones in the library. After editing, redundancy was evaluated by querying the sequences against themselves, with the result that a nonredundant data set of 1,031 sequences was developed. High-quality nonredundant sequences were searched against the SWISS-PROT database using the FASTX algorithm. A probability cutoff value (E value) of at least 10^-6 was used to assign putative identities to the STCs. The SWISS-PROT search resulted in 110 (11%) of the sequences showing similarity to sequences of known function. Significant search results were then sorted into eight different functional categories (Table 2). Many of the STCs shared sequences similar to retrotransposons and constituted a significant component of the data set (41%). The next largest categories of STCs were those involved in metabolism/photosynthesis (25%), structural roles (14%), regulatory roles (9%), ribosomal (6%), cell defense, communication, division (3%), and hypothetical proteins (1%).
Table 2 Results from query of grape BAC end sequences against the SWISS-PROT database. Significant hits (N = 110, E < 1 x 10^-6) were categorized according to function and listed in descending order with largest group first.
Cell defense, communication, and division
The highly significant STCs showing similarity to various plant proteins other than retroelements were extracted from the data set and are listed in Table 3. Of these STCs, 18% were best matches to Arabidopsis thaliana proteins, likely because both are dicots and Arabidopsis is the most-studied plant species whose genome has been sequenced. The hits on Arabidopsis proteins contained a variety of functions in the metabolic, regulatory, structural, and ribosomal categories. Other plant species giving multiple best matches included a variety of dicots and monocots: Nicotiana tabacum (10%), Zea mays (10%), Glycine max (8%), Lycopersicon esculentum (8%), Oryza sativa (8%), and Solanum tuberosum (8%). The other putative plant protein functions were based on matches from 11 other diverse plant species.
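The STC yield figures reported for the end-sequencing survey reduce to a few ratios, sketched here as a quick consistency check. This is an illustrative calculation, not part of the published analysis: the variable names are hypothetical, and it assumes the category percentages refer to the 110 significant hits (the N given for Table 2).

```python
# Counts reported in the text
clones_sequenced = 768
ends_attempted = clones_sequenced * 2  # forward and reverse read per clone
nonredundant_stcs = 1_031              # high-quality, nonredundant end sequences
significant_hits = 110                 # SWISS-PROT matches passing the E-value cutoff

# Functional category shares as listed in the text (percent of significant hits, assumed)
category_percent = {
    "retroelements": 41,
    "metabolism/photosynthesis": 25,
    "structural": 14,
    "regulatory": 9,
    "ribosomal": 6,
    "cell defense, communication, division": 3,
    "hypothetical proteins": 1,
}


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


usable_rate = percent(nonredundant_stcs, ends_attempted)
hit_rate = percent(significant_hits, nonredundant_stcs)
listed_total = sum(category_percent.values())

# Approximate hit counts per category, rounded to the nearest whole sequence
approx_counts = {
    name: round(significant_hits * pct / 100) for name, pct in category_percent.items()
}

print(f"Ends attempted: {ends_attempted}")
print(f"Usable nonredundant STCs: {usable_rate:.1f}% of ends")
print(f"STCs with significant hits: {hit_rate:.1f}%")
print(f"Listed category percentages sum to {listed_total}%")
print(approx_counts)
```

The listed percentages sum to 99 rather than 100, which is consistent with rounding in the reported values.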
Table 3 The highly significant hits (E < 1 x 10^-6) for plant proteins (excluding retroelement and proline-rich proteins) from the SWISS-PROT query of grape BAC end sequences. Entries are sorted alphabetically by best match plant species and secondly by significance. End sequences have been submitted to GenBank under accession numbers BH1676222746551 to BH1686692747586.
BAC end ID
Best match protein
Best match plant species
ATP synthase alpha chain
60S ribosomal protein L21
Calnexin homolog precursor
Glycine-rich cell wall protein
Histidine transporter protein
Protein kinase TMK1
Potential heme binding protein
Early nodulin 75 precursor
Delta-cadinene synthase isozyme XC14
23 kDa jasmonate-induced protein
Heat shock 70 kDa protein
Water stress induced tonoplast protein
Plasma membrane ATPase proton pump
60S ribosomal protein L2
Magnesium chelatase subunit
Fructose-bisphosphate aldolase
40S ribosomal protein S19
60S ribosomal protein L5
Glycine-rich cell wall protein
Photosystem II Chlorophyll A
60S ribosomal subunit L5
Photosystem II 10 kDa
Photosystem II 10 kDa
Chloroplast 30S ribosomal protein 30S
60S acidic ribosomal protein
Tubulin beta-1 chain
Here we describe the development and characterization of a high-quality BAC library for the red winegrape cultivar Syrah. This large insert library provides an important resource for map-based cloning of genes, physical mapping, and DNA sequencing in this valuable fruit crop. The library has been deposited in the Clemson University Genomics Institute BAC/EST Resource Center and is publicly available upon signing a material transfer agreement with the Centre for Plant Conservation Genetics. Requests for high-density BAC colony filter arrays and clones can be submitted through the Clemson University Genomics Institute web page (www.genome.clemson.edu).
The utility of the library for identifying important regions of the grape genome associated with color development in the berry was demonstrated by screening the BAC colony filters with 12 cDNAs representing key enzymes in the flavonoid and stilbene biosynthesis pathway. The flavonoid biosynthetic genes were chosen because they control important traits such as disease resistance, fruit color, and production of secondary metabolites, which provide benefits to human health. Most of these genes have been shown to be present in single or low copy number in many plant species, including grape. Therefore, the presence and frequency of these genes within the grape BAC library provide some indication as to the coverage of the library. CHS genes are generally present in higher copy number than other flavonoid genes, and it has been shown that StSy genes are present in greater numbers. Probes used to screen the BAC library were from expressed gene sequences and may have cross-hybridized to multiple gene family members of high sequence
similarity. The results, however, were indicative of the good representation provided by the library, as an average of 13 hits per probing was obtained when an 11X filter representation was [...]
The BAC library screening results also suggest a close physical linkage of the StSy and CHS genes in the grape genome: 19 BAC clones hybridized to CHS only, 7 hybridized to StSy only, and 8 hybridized to both CHS and StSy. Tropf et al. [16] have suggested that StSy genes have developed from CHS genes several times in the course of evolution. The close physical association of CHS and StSy genes in grape may be the result of a CHS gene duplication event, giving rise to a StSy gene.
End sequencing of 768 BAC clones (1,536 BAC ends) produced a nonredundant high-quality set of 1,031 STCs. Search results against the SWISS-PROT database gave 110 significant hits (E < 1 x 10^-6) that were sorted according to function. Not surprisingly, many of the matches were to retroelement-related sequences (41%). This is actually a fairly low level of retroelement content for a dicot. In a preliminary survey of BAC end sequences for tomato, a higher level (48%) of retroelement matches was found. Because of the small genome size for grape and the low level of retroelement content, the grape BAC library is ideally suited for the development of an STC framework.
Literature Cited
1. Arumuganthan, K., and E. Earle. Nuclear DNA content of some important plant species. Plant Mol. Biol. Rep. 9:208-218 [...]
2. [...] sequence database and its supplement TrEMBL. Nucleic Acids Res. [...]
3. Budiman, A., L. Mao, T. Wood, and R. Wing. A deep-coverage tomato BAC library and prospects toward development of an STC framework for genome sequencing. Genome Res. 10:129-136 [...]
4. Dalbo, M., G. Ye, N. Weeden, H. Steinkellner, K. Sefc, and B. Reisch. A gene controlling sex in grapevines placed on a molecular marker-based genetic map. Genome 43:333-340 (2000).
5. Ewing, B., and P. Green. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8:186-194 [...]
6. Ewing, B., L. Hillier, M.C. Wendl, and P. Green. Base-calling [...] assessment. Genome Res. 8:175-185 (1998).
7. Lodhi, M., M. Daly, G. Ye, N. Weeden, and B. Reisch. A molecular marker based linkage map of Vitis. Genome 38:786-794 [...]
8. Luo, M., Y. Wang, D. Frisch, T. Joobeur, R. Wing, and R. Dean. Melon BAC library construction using improved methods and identification of clones linked to the locus conferring resistance to melon Fusarium wilt (Fom-2). Genome 44:154-162 (2001).
9. Peterson, D., J. Tomkins, D. Frisch, R. Wing, and A. Paterson. Construction of plant bacterial artificial chromosome (BAC) libraries: An illustrated guide. J. Agric. Genomics. Vol. 5. [...]
10. Sambrook, J., E. Fritsch, and T. Maniatis. Molecular Cloning: A Laboratory Manual. 2d ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1989).
11. Shizuya, H., B. Birren, U. Kim, V. Mancino, T. Slepak, Y. Tachiiri, and M. Simon. Cloning and stable maintenance of 300-kilobase-pair fragments of human DNA in Escherichia coli using an F-factor-based vector. Proc. Nat. Acad. Sci. U.S.A. 89:8794-8797 [...]
12. Sparvoli, F., C. Martin, A. Scienza, G. Gavazzi, and C. Tonelli. Cloning and molecular analysis of structural genes involved in flavonoid and stilbene biosynthesis in grape (Vitis vinifera L.). Plant Mol. Biol. 24:743-755 (1994).
13. Tinlot, R., and M. Rousseau. The state of viticulture in the world and the statistical information in 1992. Bull. OIV 66:861-946 (1993).
14. [...] J.L. Goicoechea, H.T. Knapp, and R.A. Wing. A soybean bacterial artificial chromosome library for PI 437654 and the identification of clones associated with cyst nematode resistance. Plant Mol. Biol. [...]
15. Tomkins, J., Y. Yu, H. Miller-Smith, D.A. Frisch, S.S. Woo, and [...] sugarcane. Theor. Appl. Genet. 99:419-424 (1999).
16. Tropf, S., T. Lanz, S.A. Rensing, J. Schroder, and G. Schroder. Evidence that stilbene synthases have developed from chalcone synthases several times in the course of evolution. J. Mol. [...]
17. Venter, C., H. Smith, and L. Hood. A new strategy for genome sequencing. Nature 381:364-366 (1996).
18. Woo, S.S., J. Jiang, B. Gill, A. Paterson, and R.A. Wing. Construction and characterization of a bacterial artificial chromosome library of Sorghum bicolor. Nucleic Acids Res. 22:4922-4931 (1994).
19. Yu, Y., J.P. Tomkins, R. Waugh, D.A. Frisch, D. Kudrna, A. Kleinhofs, R. Brueggeman, G. Muehlbauer, R. Wise, and R. Wing. A bacterial artificial chromosome library for barley (Hordeum vulgare L.) and the identification of clones containing putative resistance genes. Theor. Appl. Genet. 101:1093-1099 (2000).