Introduction To Ecological Genomics 2 Edition

Author : Dick Roelofs and Nico M. van Straalen

ISBN : 9780199594689

Year : 2012

Language: English

File Size : 11.2 MB

Category : Biology



An Introduction to Ecological Genomics

An Introduction to
Ecological Genomics
Second Edition
Nico M. van Straalen and Dick Roelofs
VU University Amsterdam

Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide in
Oxford New York
Auckland Cape Town Dar es Salaam Hong Kong Karachi
Kuala Lumpur Madrid Melbourne Mexico City Nairobi
New Delhi Shanghai Taipei Toronto
With offices in
Argentina Austria Brazil Chile Czech Republic France Greece
Guatemala Hungary Italy Japan Poland Portugal Singapore
South Korea Switzerland Thailand Turkey Ukraine Vietnam
Oxford is a registered trade mark of Oxford University Press
in the UK and in certain other countries
Published in the United States
by Oxford University Press Inc., New York
© Nico M. van Straalen and Dick Roelofs 2012
The moral rights of the authors have been asserted
Database right Oxford University Press (maker)
First edition 2006
Second edition published 2012
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
without the prior permission in writing of Oxford University Press,
or as expressly permitted by law, or under terms agreed with the appropriate
reprographics rights organization. Enquiries concerning reproduction
outside the scope of the above should be sent to the Rights Department,
Oxford University Press, at the address above
You must not circulate this book in any other binding or cover
and you must impose the same condition on any acquirer
British Library Cataloguing in Publication Data
Data available
Library of Congress Cataloging in Publication Data
Library of Congress Control Number: 2011933690
Cover design by Janine Mariën
Typeset by SPI Publisher Services, Pondicherry, India
Printed in Great Britain
on acid-free paper by
CPI Group (UK) Ltd, Croydon, CR0 4YY
ISBN 978–0–19–959468–9 (Hbk)
ISBN 978–0–19–959469–6 (Pbk)
10 9 8 7 6 5 4 3 2

Preface to the second edition

How fast ecological genomics has moved forward
since the first edition of this book! The few years
that have elapsed since then (2006–2011) have not
only seen the rise of extremely fast, so-called ‘next-generation’ DNA sequencing technology, but also a
host of excellent new studies in which genomics
technology has been applied to address ecological
questions. We are particularly impressed by the
massive progress in comparative genomics, phylogenomics, population genomics, and metagenomics. We tried to pay particular attention to the new
frontiers created by these fields as we prepared the
second edition of this book.
In comparison to the previous edition, we
skipped the ‘Genome analysis’ chapter, as the
technology has moved forward to such an extent
that a great deal of that chapter was considered
outdated. Instead, we added a separate section on
methodology and data analysis to Chapter 1, in
which we also treat the new-generation sequencing technologies.
Furthermore, we have included a completely new
chapter on variation and adaptation, in which we
treat the various aspects of genome variability, and
pay a good deal of attention to population genomics, a topic of increasing popularity among population-oriented ecological genomicists. In this chapter
the reader may also find extensive reference to the
issue of neutrality in molecular evolution. In addition we discuss the various aspects of genome architecture
in relation to gene expression and epigenetic
regulation of genes. Altogether we believe that this
new chapter has expanded the scope of the book to
include a wider variety of topics of interest to evolutionary ecologists.
We have retained our ‘problem-orientated’
approach to introduce each chapter, plus an
appraisal section at the end. With this design we
want to emphasize that, in the end, ecological
genomics is just another branch of ecology and that
it addresses questions that all ecologists ask, only
with different technology. On the other hand, it is
our opinion that ecological genomics also brings
new questions to the discourse that were not raised
before or could not be answered by more traditional
ecological approaches.
What was marginally doable in 2005 has now
become an impossibility: to summarize all publications of relevance to ecological genomics in a single
book. We hope that the reader may find guidance
and inspiration in this book to further study more
specialized areas of their interest that we could not
cover.
We hope this book will support graduate programmes of Ecology, Evolutionary Biology, or similar programmes, and stimulate students to proceed
with their career in the exciting field of ecological
genomics, while it is—still—relatively new.
Nico M. van Straalen and Dick Roelofs
Amsterdam, February 2011

Preface to the first edition

This book is an introduction to the exciting new
field of ecological genomics, for use in MSc courses
and by those beginning their PhD studies.
When we became involved in a national research
programme on ecological genomics, or ecogenomics as it became known, we realized that information on this newly emerging subject needed to be
brought together. In order to start up a research programme in such a new discipline, not only the students, but also we as teachers, had to get to grips
with the subject. Furthermore, although obtaining a
PhD implies mastering a specialized field, the PhD
student must be able to place this field in a broader
context if he or she is to become a mature scientist.
This approach may be called the T-model of education; the horizontal bar of the T representing a broad
understanding, and the vertical bar an investigation
in depth, going down to the root of the problem.
Our book uses this approach.
We assume a basic level of knowledge in the biological sciences to BSc level: ecology, evolutionary
biology, microbiology, plant physiology, animal
physiology, genetics, and molecular biology. We
have tried to link up with the content of the most
common textbooks in these fields, at the same time
realizing that students of ecological genomics have
a variety of backgrounds. However, our main targets are students with subjects closely related to
ecology and evolutionary biology, which is why we
place the emphasis on aspects that we judge to be
particularly new to them.
Evolutionary genomics and bioinformatics are
companion disciplines to ecological genomics. In
the last 10 years interest in both disciplines has
grown enormously. Several textbooks on bioinformatics have already been published and subjects
encompassed by evolutionary genomics, such as
comparative genomics, phylogenetic analysis, and
molecular evolution, can now be considered as
fields in their own right. They are certainly too large
to be covered in an introductory book on ecological
genomics; indeed, evolutionary genomics deserves
a textbook of its own.
We have organized this book around three issues
important in modern ecology, choosing questions
for which the links to genomics are best developed.
At the outset, we perhaps use rather ambitious
phrasing to announce the genomics approach to
these ecological questions. Maybe our questions
cannot be answered at this stage. However, we
decided not to suppress unanswered, and thus
open, issues. Instead we hope to stimulate discussion as well as provide factual information. We have
included an appraisal section at the end of each
chapter to emphasize this question-orientated
approach. Combined with information given in the
introductory section, this allows the reader to grasp
the main points of each chapter, even if the detailed
treatment of molecular principles and case studies
are left aside.
Case studies are taken from literature published
since the year 2000. Nevertheless, a book on genomics runs the risk of becoming outdated very quickly:
the rate at which knowledge is being accrued and
insight developed is unprecedented. However, we
hope that our question-orientated set-up will be
useful for some years to come, even when new and
better case studies are available.
Before this book was written, journal articles
comprised the only literature on ecological genomics. These, although very inspiring, were scattered
widely. Today, most textbooks on genetics and evolution have a chapter on genomics. Gibson and
Muse published a primer on genome science in
2002, but this did not cover ecological questions. So,
for us, writing this book was ploughing unknown
ground. We have attempted to add structure to the
field, and hopefully have put ecological genomics
on the map. However, we welcome constructive
criticism and suggestions from our readers.
We thank the colleagues who reviewed parts of
the book, suggested issues that had escaped us, or
helped with correcting the English: Martin Feder,
Claire Hengeveld, Jan Kammenga, René Klein
Lankhorst, Bas Kooijman, Jan Kooter, Wilfred
Röling, and Martijn Timmermans. We thank Desirée
Hoonhout and Karin Uyldert for checking the reference list, and Nico Schaefers, for preparation of the
figures. Ian Sherman at Oxford University Press
provided us with stimulating discussion. We thank
members of the Animal Ecology Department at the
Vrije Universiteit for their friendship and encouragement. N.M.vS. also thanks the Faculty of Earth
and Life Sciences of the Vrije Universiteit for granting the sabbatical leave during which most of this
book was written.
Nico M. van Straalen and Dick Roelofs,
Amsterdam, July 2005

Contents

1 Ecological genomics and genome analysis
1.1 The genomics revolution invading ecology
1.2 Yeast, fly, worm, and weed
1.3 -Omics speak
1.4 Genome analysis

2 Comparing genomes
2.1 Properties of genomes
2.2 Prokaryotic genomes
2.3 Eukaryotic genomes

3 Structure and function in communities
3.1 The biodiversity and ecosystem functioning synthetic framework
3.2 Measurement of microbial biodiversity
3.3 Microbial genomics of biogeochemical cycles
3.4 Reconstruction of functions from environmental genomes
3.5 Genomic approaches to biodiversity and ecosystem function: an appraisal

4 Life-history patterns
4.1 The core of life-history theory
4.2 Longevity and aging
4.3 Gene-expression profiles in the life cycle
4.4 Phenotypic plasticity of life-history traits
4.5 Genomic approaches to life-history patterns: an appraisal

5 Stress responses
5.1 Stress and the ecological niche
5.2 The main defence mechanisms against cellular stress
5.3 Heat, cold, drought, salt, and hypoxia
5.4 Herbivory and microbial infection
5.5 Toxic substances
5.6 Genomic approaches to ecological stress: an appraisal

6 Variation and adaptation
6.1 The internal tangled bank
6.2 Genomic polymorphisms
6.3 Regulatory and structural change
6.4 Epigenetic variation and developmental change
6.5 Genomic approaches to variation and adaptation: an appraisal

7 Integrative ecological genomics
7.1 The need for integration: systems biology
7.2 Ecological control analysis
7.3 Outlook

References

Index

CHAPTER 1

Ecological genomics and
genome analysis

We define ecological genomics as:
a scientific discipline that studies the structure and functioning of a genome with the aim of understanding the
relationship between the organism and its biotic and abiotic environments.

With this book we hope to contribute to this new
discipline by summarizing the developments over
the last ten years and explaining the general principles of genomics technology and its application to
ecology. Using examples drawn from the scattered
literature, we indicate where ecological questions
can be analysed, reformulated, or solved by means
of genomics approaches. This first chapter introduces the main purpose of ecological genomics. We
describe its characteristics, its interactions with
other disciplines, and its fascination with model
species. Then we briefly introduce some of the most
important technologies and the associated data
analysis approaches.

1.1 The genomics revolution invading
ecology
The twentieth century has been called the ‘century
of the gene’ (Fox Keller 2000). It began with the
rediscovery in 1900 of the laws of inheritance by
De Vries, Correns, and Von Tschermak, laws that
had been formulated about 40 years earlier by
Gregor Mendel. With the appearance of the Royal
Horticultural Society’s English translation of
Mendel’s papers, William Bateson suggested in a
letter in 1902 that this new area of biology be called
genetics. The word gene followed, coined by
Wilhelm Ludvig Johannsen in 1909, and then in
1920 the German botanist Hans Winkler proposed
the word genome. The term genomics did not
appear until the mid-1980s and was introduced in
1987 as the name of a new journal (McKusick and
Ruddle 1987). The century ended with the genomics revolution, culminating in the announcement of
the completion of a draft version of the human
genome in the year 2000.
Realizing the importance of Mendel’s papers,
William Bateson announced that genetics was to
become the most promising research area of the life
sciences. One hundred years later one cannot avoid
the conclusion that the progress in understanding
the role of genes in living systems has indeed been
astonishing. The genomics revolution has now
expanded beyond genetics, its impact being felt in
many other areas of the life sciences, including ecology. In the ecological arena, the interaction between
genomics and ecology has led to a new field of
research, evolutionary and ecological functional genomics. Feder and Mitchell-Olds (2003) indicated that
this new multidiscipline ‘focuses on the genes that
affect evolutionary fitness in natural environments
and populations’.
Our definition of ecological genomics given
above seems at first sight to include the basic aim of
ecology, viewing genomics as a new tool for analysing fundamental ecological questions. However,
the merging of genomics with ecology includes
more than the incorporation of a toolbox, because
with the new technology new scientific questions
emerge and existing questions can be answered in a
way that was not considered before. We expect
therefore that ecological genomics will develop into a
truly new discipline, and will forge a mechanistic
basis for ecology that is often felt to be missing. This
could also strengthen the relationship between ecology and the other life sciences, because to a certain
extent ecological genomicists speak the same language
and read the same papers as molecular biologists.
Figure 1.1 illustrates the various fields from which
ecological genomics draws and upon which it is still
growing. First of all, as indicated by Feder and
Mitchell-Olds (2003), ecological genomics is closely
linked to evolutionary biology and the associated
disciplines of population genetics and evolutionary
ecology. Another major area supporting ecological
genomics is plant and animal physiology, which has
its base in biochemistry and cell biology. A special
position is held by microbial ecology, the meeting
place of microbiology and ecology, where the use of
genomics approaches has proceeded further than in
any other subdiscipline of ecology. We consider
genomics itself as a mainly technological advance,
supporting ecological genomics in the same way as
it supports other areas of the life sciences, such as
medicine, neurobiology, and agriculture.
The genomics revolution is not only due to
advances in molecular biology. Three major technological developments that took place in the 1990s
also made it possible: microtechnology, computing,
and communication.
Microtechnology. The possibility of working with
molecules on the scale of a few micrometres, given
by advances in laser technology, has been very
important for one of genomics’ most conspicuous
achievements, the development of the gene chip.
Computing technology. To assemble a genome from
a series of sequences requires tremendous computational power. Extensive calculations are also necessary for the analysis of expression matrices and
protein databases. Without the advent of high-speed computers and data-storage systems of vast
capacity all this would have been impossible.
Communication technology. Consulting genome
databases all over the world has become such normal practice that the scientific progress of any
genomics laboratory has become completely
dependent on communication with the rest of the
World Wide Web. The Internet has become an indispensable part of genomics.
The essence of genomics is that it is the study of
the genome and its products as a unitary whole. In
biology, the suffix -ome signifies the collectivity of
units (Lederberg and McCray 2001), as for example
in coelome, the system of body cavities, and biome,
the entire community of plants and animals in a climatic region. In aiming to investigate many genes
at the same time genomics differs from ecology,
which although investigating many phenotypes,
usually deals with only a few genes at a time (Fig.
1.2). Ecological genomics borrows from these two
extremes, investigating phenotypic biodiversity as
well as diversity in the genome. With this new discipline, ecology is enriched by genomics technology
and genomics is enriched by ecological questioning
and evolutionary views.

Figure 1.1 The position of ecological genomics in the middle of
the other life-science disciplines with which it interacts most
intensively. [Diagram labels: evolution, genetics, population genetics, evolutionary ecology, microbiology, microbial ecology, physiological ecology, plant and animal physiology.]

Figure 1.2 The playing field of ecological genomics, in between
genomics, with its focus on the single genome of a model organism,
studying all the genes that it contains, and ecology, studying a few
genes in many species. [Axes: number of genes against number of phenotypes.]

Because genomics analyses the genome in its
entirety, it transcends classical genetics, which studies genes one by one, relating DNA sequences to
proteins and ultimately to heritable traits. Genomics
is based on the observation that the impact of one
gene on the phenotype can only be understood in
the context of the expression of several other genes
or, in fact, of all other genes in the genome, plus
their products, metabolites, cell structures, and all
the interactions between them. This is not to say
that every study in genomics deals with everything
all the time, but that the mind is set and tools are
deployed to maximize awareness of any effects
elsewhere in the genome, outside the system under
study. Consequently genomics is invariably associated with unexpected findings. The discovery
aspect of genomics is expressed aptly in a public-education project of Genome Canada entitled The
GEEE! in Genome (www.genomecanada.ca).
The work of Spellman and Rubin (2002) and their
discovery of transcriptional territories in the genome
of the fruit fly, Drosophila melanogaster, is an example
of how the genomics approach can fundamentally
alter our way of thinking about the relationship
between genes and the environment (see also
Weitzman 2002). The authors carried out transcription profiling with DNA microarrays (see Section
1.4) to investigate the expression of almost all of the
genes in the fruit fly’s genome under 88 different
environmental conditions. Their work was in fact a
meta-analysis of transcription profiles collected earlier in six separate investigations. Because the complete genome sequence of Drosophila is known, it
was possible to trace every differentially expressed
gene back to its chromosomal position. They concluded that genes physically adjacent in the genome
often had similar expression when comparing different environmental challenges. The window of
correlated expression appeared to extend to 10 or
more adjacent genes and they estimated that 20% of
the genome was organized in such ‘expression clusters’. Most astonishingly, genes in one cluster
proved to be no more similar in structure or function than could be expected from a random arrangement. Spellman and Rubin (2002) suggested that
local changes in chromatin structure trigger the
expression of large groups of genes together. Thus a
gene may be expressed not because there is a particular need for its product, but because its neighbour is expressed for a reason completely unrelated
to the function of the first gene. At the moment it is
not known whether such mechanisms lead to unexpected correlations between phenotypic traits, but
surely the discovery of transcriptional territories
could never have been made on a gene-by-gene
basis; it required the genomics approach.
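The idea of correlated expression among chromosomal neighbours can be illustrated with a toy calculation. What follows is a minimal sketch on simulated data, not the analysis carried out by Spellman and Rubin (2002). The function name `adjacent_correlation`, the matrix of 200 genes, and the assumption that every block of 10 adjacent genes shares one regional signal are all hypothetical choices made for illustration; only the figure of 88 conditions is taken from the text.

```python
import numpy as np

def adjacent_correlation(expr):
    """Pearson correlation between each gene and its right-hand neighbour.

    expr: 2-D array, rows = genes in chromosomal order, columns = conditions.
    Returns an array of length n_genes - 1.
    """
    centred = expr - expr.mean(axis=1, keepdims=True)
    norm = np.linalg.norm(centred, axis=1)
    # Row-wise dot product of each centred profile with the next one,
    # scaled by both norms so that the result lies in [-1, 1].
    return np.einsum("ij,ij->i", centred[:-1], centred[1:]) / (norm[:-1] * norm[1:])

rng = np.random.default_rng(0)
n_genes, n_conditions, cluster_size = 200, 88, 10   # illustrative sizes

# Simulated data (an assumption, not real measurements): each block of
# adjacent genes shares one hidden regional signal plus gene-specific noise.
n_clusters = n_genes // cluster_size
regional = np.repeat(rng.normal(size=(n_clusters, n_conditions)), cluster_size, axis=0)
expr = regional + rng.normal(size=(n_genes, n_conditions))

# Compare the true gene order with a randomly shuffled order.
ordered = adjacent_correlation(expr).mean()
shuffled = adjacent_correlation(expr[rng.permutation(n_genes)]).mean()
print(f"mean neighbour correlation, chromosomal order: {ordered:.2f}")
print(f"mean neighbour correlation, shuffled order:    {shuffled:.2f}")
```

In this simulation the neighbour correlation is clearly higher in chromosomal order than after shuffling, which is the kind of positional signal that only becomes visible when all genes are profiled together and mapped back to the genome.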

1.2 Yeast, fly, worm, and weed
A striking feature of genomics is its focus on a limited number of model species with fully sequenced
genomes and large research networks organized
around them. The genomes of these model species
have been sequenced completely and the information is shared on the Internet, allowing scientists to
take maximal advantage of progress made by others.
This explains the extreme speed with which the field
is developing. Ecology does not have a strong tradition in standardized experimentation with one species. Thus the genomics approach is all the more
striking to an ecologist, who is often more fascinated
by the diversity of life than by a single organism, and
engaged in a very wide variety of topics, systems,
and approaches. In this section we examine the arguments for introducing model species in ecological
genomics.
The best-known completely sequenced genomes,
in addition to those of mouse and human, are those
of the yeast Saccharomyces cerevisiae, the ‘fly’
Drosophila melanogaster, the ‘worm’ Caenorhabditis
elegans, and the ‘weed’ Arabidopsis thaliana.
Investigations into the genomes of these model
organisms are supported by extensive databases on
the Internet that provide a wealth of information
about genome maps, genomic sequences, annotated
genes, allelic variants, cDNAs, and expressed

4

A N I N T R O D U C T I O N TO E C O L O G I C A L G E N O M I C S

sequence tags (ESTs), as well as news, upcoming
events, and publications. These four model genomes
and their relationships with evolutionarily related
species will be discussed in more detail in Chapter 2.
The genomics of the mouse and human are not
discussed at length in this book because the model
status of these two species has mainly a medical
relevance.
The first genome to be sequenced completely was
that of Haemophilus influenzae (Fleischmann et al.
1995). This bacterium is associated with influenza
outbreaks but does not cause the disease, which is
caused by a virus. Although several years earlier the
‘genome’ of bacteriophage ΦX174 had been
sequenced (Sanger 1977a), 1995 is considered by
many as the true beginning of genomics as a science, not least because the H. influenzae project
demonstrated the usefulness of a new strategy of
sequencing and assembly (whole-genome shotgun
sequencing; see Section 1.4). With 1.8 Mbp the
genome of H. influenzae was about 10 times larger
than that of any virus sequenced before, but still
two to four orders of magnitude smaller than the
genome of most eukaryotes. Genome sequences of
many other prokaryotes soon followed, including
that of Methanococcus jannaschii, an archaeon living
at a depth of 2600 m near a hydrothermal vent on
the floor of the Pacific Ocean (Bult et al. 1996). The
genome of this extremophile was interesting because
of the many genes that were completely unknown
before. In 1989, a large network of scientists
embarked on a project for sequencing the yeast
genome, which was completed in 1996 and was the
first eukaryotic genome to be elucidated (Goffeau
et al. 1996). Thus, by 1996, the first genomic comparisons were possible between the three domains
of life: Bacteria, Archaea, and Eukarya.
The international Human Genome Project, initiated
by the US National Institutes of Health and the US
Department of Energy, was launched in 1990 with
completion due in 2005. However, in the meantime
a private enterprise, Celera Genomics, embarked on
a project with the same aim but a different approach
and actually overtook the Human Genome Project.
The competition was settled with the historic press
conference on 26 June 2000, when US President Bill
Clinton, J. Craig Venter of Celera Genomics, and
Francis Collins of the National Institutes of Health
jointly announced that a working draft of the human
genome had been completed (Fig. 1.3). Many commentators have qualified this announcement as
more a matter of public communication than
scientific achievement. At that time the accepted criterion for completion of a genome sequence, namely
that only a few gaps or gaps of known size remained
to be sequenced and that the error rate was below 1
in 10 000 bp, had not been closely met. The euchromatin part of the genome was not completed until
mid-2004, although that milestone was again considered by some to be only the end of the beginning
(Stein 2004). Nevertheless, the Human Genome
Project can be regarded as one of the most successful scientific endeavours in history and the assembly of the 3.12 billion bp of DNA, requiring some
500 million trillion sequence comparisons, was the
most extensive computation that had ever been
undertaken in biology.
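To give a sense of what the completion criterion means in absolute terms, the two figures quoted above can be combined in a short calculation. This is only a back-of-the-envelope sketch: it assumes the threshold of 1 error in 10 000 bp is applied uniformly over the 3.12 billion bp mentioned in the text, and the variable names are illustrative.

```python
GENOME_BP = 3_120_000_000    # 3.12 billion bp, as quoted in the text
ERROR_INTERVAL_BP = 10_000   # completion criterion: below 1 error per 10 000 bp

# Number of erroneous bases that would still be present exactly at the threshold.
max_errors = GENOME_BP // ERROR_INTERVAL_BP
print(f"errors remaining at the threshold: {max_errors:,}")
```

Under these assumptions a sequence sitting exactly at the threshold would still contain about 312 000 erroneous bases, which helps to explain why even a 'complete' genome was described as only the end of the beginning.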
New ultra high-throughput sequencing techniques, also called next-generation sequencing (the
technologies will be explained in Section 1.4), have
caused a second revolution in genome sequencing.
The number of organisms whose genome has been
sequenced completely and published has exceeded
1300 (Liolios et al. 2009). By 2010, no fewer than 188
Archaea, 4800 bacterial organisms, and 1524 eukaryotes were the subject of ongoing genome sequencing projects. Bacteria dominate the list, as the small
size of their genomes makes these organisms well-suited for whole-genome sequencing.
The list of species with completed genome
sequences does not represent a random choice from
the Earth’s biodiversity. From an ecologist’s point of
view, the near absence of reptiles, amphibians, molluscs, and annelids is striking, as also is the scarcity
of birds and arthropods other than the insects. How
does a species come to be a model in genomics? We
review the various arguments below, asking
whether they would also apply when selecting
model species for ecological studies.
Previously established reputation. This holds for
yeast, C. elegans, Drosophila, mouse, and rat. These
species had already proved their usefulness as models before the genomics revolution and were
adopted by genomicists because so much was

E C O L O G I C A L G E N O M I C S A N D G E N O M E A N A LY S I S

5

Figure 1.3 From left to right: J. Craig Venter (Celera Genomics), President Clinton, and Francis Collins (National Institutes of Health) on the
historic announcement of 26 June 2000 of the completion of a working draft of the human genome. © Win McNamee/Reuters.

known about their genetics and biochemistry and,
perhaps just as importantly, because a large research
community was interested, could support the work,
and use the results.
Genome size. One of the first questions that is
asked when a species is considered for whole-genome sequencing is, what is the size of its
genome? At least in the beginning, a relatively small
genome was a major advantage for a sequencing
project. The genome size of living organisms ranges
across nine orders of magnitude, from 10³ bp (0.001
Mbp) in RNA viruses to nearly 10¹² bp (1 000 000
Mbp) in some protists, ferns, and amphibians (cf.
Fig. 2.1). The puffer fish, Takifugu rubripes, was chosen because of its relatively small genome (one-eighth of the human genome). The issue of genome
size has become less important over the years, due
to the rise of faster and faster sequencing
technologies.
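The genome sizes mentioned in this chapter can be placed on a common logarithmic scale with a few lines of code. This is a rough sketch that uses only numbers quoted in the text; the puffer fish value is derived here as one-eighth of the 3.12 billion bp given for the human genome, which is an approximation introduced for illustration and not a measured size.

```python
import math

# Genome sizes in base pairs, taken or derived from figures quoted in the text.
sizes_bp = {
    "RNA virus (lower bound)": 1e3,
    "Haemophilus influenzae": 1.8e6,
    "Takifugu rubripes (1/8 of human)": 3.12e9 / 8,   # derived, approximate
    "human": 3.12e9,
    "largest genomes (upper bound)": 1e12,
}

for name, bp in sizes_bp.items():
    print(f"{name:34s} {bp / 1e6:>16,.3f} Mbp   log10(bp) = {math.log10(bp):5.2f}")

# Width of the whole range, expressed in orders of magnitude.
span = math.log10(1e12) - math.log10(1e3)
print(f"range spanned: {span:.0f} orders of magnitude")
```

The final line reproduces the nine orders of magnitude stated above, and the logarithms show where the early sequencing targets sit within that range.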
Possibility for genetic manipulation. The possibility
of genetic manipulation was an important reason

why Arabidopsis, Drosophila, and mouse became
such popular genomic models. The ultimate answer
about the function of a gene comes from studies in
which the genome segment is knocked out, downregulated, or overexpressed against a genetic background that is the same as that of the wild type.
Also, the introduction of constructs in the genome
that can report activity of certain genes by means of
signal molecules is very important. This can only be
done if the species is accessible using recombinant-DNA techniques. Foreign DNA can be introduced
using transposons; for example, modified P-elements
that can ‘jump’ into the DNA of Drosophila, or bacteria such as Agrobacterium tumefaciens that can transfer a piece of DNA to a host plant. DNA can also be
introduced by physical means, especially in cell cultures, using electroporation, microinjection, or bombardment with gold particles. Another popular
approach is post-transcriptional gene silencing
using RNA interference (RNAi), also called inhibitory RNA expression. The question can be asked,
should the possibility for genetic manipulation be
an argument for selecting model species in ecological genomics? We think that it should, knowing that
the capacity to generate mutants and transgenes of
ecologically relevant species is crucial for confirming
the function of genes. Ecologists should also use the
natural variation in ecologically relevant traits to
guide their explorations of the genome (Koornneef
2004; Tonsor et al. 2005, see also Chapter 6). A basic
resource for genome investigation can be obtained
by using natural varieties of the study species, and
developing genetically defined culture stocks.
Medical or agricultural significance. Many bacteria
and parasitic protists were chosen because of their
pathogenicity to humans. Other bacteria and fungi
were taken as genomic models because of their
potential to cause plant diseases (phytopathogenicity). Obviously, the sequencing of rice was motivated by the huge importance of this species as a
staple food for the world population (Adam 2000).
Some agriculturally important species have great
relevance for ecological questions; for example, the
bacterium Sinorhizobium meliloti, a symbiont of leguminous plants, is known for its nitrogen-fixing
capacities, but it also makes an excellent model system for the analysis of ecological interactions in
nutrient cycling, together with its host Medicago
truncatula.
Biotechnological significance. Many bacteria and
fungi are important as producers of valuable products, for example antibiotics, medicines, vitamins,
soy sauce, cheese, yoghurt, and other foods made
from milk. There is considerable interest in analysing the genomes of these microorganisms because
such knowledge is expected to benefit production
processes (Pühler and Selbitschka 2003). Other bacteria are valuable genomic models because of their
capacity to degrade environmental pollutants; for
example, the marine bacterium Alcanivorax borkumensis is a genomic model because it produces surfactants and is associated with the biodegradation
of hydrocarbons in oil spills (Röling et al. 2004; Head
et al. 2006).
Evolutionary position. Whole-genome analysis of
organisms at crucial or disputed positions in the
tree of life can be expected to contribute significantly
to our knowledge of evolution. The sea squirt, Ciona
intestinalis, was chosen as a model because it belongs
to a group, the Urochordata, with properties similar
to the ancestors of vertebrates. The study of this
species should provide valuable information about
the early evolution of the phylum to which we
belong ourselves. Methanococcus jannaschii was chosen for more or less the same reason, because it was
the first sequenced representative from the domain
of the Archaea. Many other organisms, although
not on the list for a genome project to date, have a
strong case for being declared as model species for
evolutionary arguments. These include the velvet
worm, Peripatus, traditionally seen as a missing link
between the arthropods and annelids, but now
classified as a separate phylum in the Panarthropoda
lineage (Nielsen 1995), and the springtail, Folsomia
candida, regarded as ancestral to all insects
(Timmermans et al. 2008).
Comparative purposes. Over the last few years,
genomicists have realized that assigning functions
to genes and recognizing promoter sequences in a
model genome can greatly benefit from comparison
with a set of carefully chosen reference organisms at
defined phylogenetic distances. Comparative
genomics is developing an increasing array of bioinformatics techniques, such as synteny analysis,
phylogenetic footprinting, and phylogenetic shadowing
(see Chapter 2), by which it is possible to understand aspects of a model genome from other
genomes. One of the main reasons for sequencing
the chimpanzee’s genome was to illuminate the
human genome, and a variety of fungi were
sequenced to illuminate the genome of S. cerevisiae.
Ecological significance. Since the completion of the
human genome sequence, ecological arguments have played
an increasing role in the selection of species for
whole-genome sequencing, and we expect them to
become more important in the future. Jackson et al.
(2002) have formulated arguments for the selection
of ecological model species, and we present them in
a slightly adapted form.
Diversity of physiologies. The new range of models
should embrace diverse phylogenetic lineages, varying in their physiology and life-history strategy.
For example, the model plants Arabidopsis and rice
both employ the C3 photosynthetic pathway. To
complement our understanding of primary production,
genomic analysis of plants utilizing C4 photosynthesis (e.g. sorghum) or crassulacean acid
metabolism (CAM) will be highly informative.
Considering the diversity of life histories, species
differing in their mode of reproduction and dispersal
capacity should be chosen; for example, hermaphroditism versus gonochorism, parthenogenesis versus bisexual reproduction, and so on.
Ecological interactions. Species that take part in
critical ecological interactions (mutualisms, antagonisms) are obvious candidates for genomic analysis.
One may think of mycorrhizal fungi, nitrogen-fixing
symbionts, pollinators, natural enemies of pests,
parasites, and so on. The most obvious strategy for
analysing such interactions is to sequence the
genomes of the players involved and to try to
understand the interactions between them from mutually influential patterns of gene expression. An exciting model
is presented by the pea aphid, Acyrthosiphon pisum,
and its obligate bacterial symbiont, Buchnera aphidicola. The symbiotic bacterium has lost many genes,
of which a few have been transferred to the genome
of its host. Analysis of the two genomes indicates
the presence of extensive exchange of metabolites
between symbiont and host, for instance a shared
amino acid biosynthesis (Perez-Brocal et al. 2006;
Richards et al. 2010).
Suitability for field studies. The wealth of knowledge from experienced field ecologists should play a
role in deciding about new ‘ecogenomic’ models.
Not all species lend themselves to studies of behaviour, foraging strategy, habitat choice, population
structure, dispersal, or migration in the field, simply
because they are too rare, not easily spotted, difficult
to sample quantitatively, impossible to mark and
recapture, not easy to distinguish from related species, or inaccessible to invasive techniques. Thus
suitability for field research is another important criterion. The threespine stickleback, Gasterosteus
aculeatus, and water flea, Daphnia pulex, are considered to be highly suitable for ecological field studies
and have a long-standing history of ecological investigation. Recently, genome sequences for both of
these ecological models have become available.
Feder and Mitchell-Olds (2003) developed a similar series of criteria for an ideal model species in
evolutionary and ecological functional genomics
(Fig. 1.4). These authors point out that there is currently a discrepancy between classical model species and many ecologically interesting species.
Models such as Drosophila and Arabidopsis are not
very suitable for ecological studies, whereas many
popular ecological models have a poorly characterized genome and lack a large community of investigators. In some cases a large ecological community
is available, but functional genomic studies are
difficult for reasons of quite another nature. For
example, many ecologists favour wild birds as a
study object, but there are ethical objections to
genetic manipulation of such species and laboratory experiments are restricted by law.
Still, we foresee that all the major ecological models will also become genomic models. Using next-generation sequencing technologies, extremely large
amounts of sequence data can be generated in a
very cost-effective way. The saturation point could
very well be due to the limited number of molecular
ecologists in the worldwide scientific community.
This is not to say, however, that all questions in ecological genomics require the full-length DNA
sequence of a species before they can be answered.
Some issues may prove to be solvable with the use
of less extensive genomic investigations, for example a gene hunt followed by high-throughput quantitative PCR, rather than transcription profiling of
the complete genome (see Section 1.4). In addition,
microarray studies with part of the expressed
genome are possible even in species lacking a complete DNA sequence. Microarrays can be manufactured at costs that are affordable for small research
groups if they are limited to genes associated with a
specific function or response pathway (Held et al.
2004).

Ideal model species

Infrastructure
• Large, active, and interactive community of investigators
• Physical and virtual community resources
• Interaction with other basic and applied communities

Gene discovery and phylogenetic data
• Forward and reverse genetic tools
• Capacity to detect variation, including differences in transcript and protein levels
• Known phylogeny to enable, for example, historical change in traits of interest to be inferred

Molecular data
• Access to genomic sequence and chromosomal maps
• Upstream regulators and downstream targets identified for the gene of interest
• Function of gene product known and its impact on fitness under natural conditions inferred

Ecological context
• Relatively undisturbed habitats in the native range of the species
• Observable ecology and behaviour in nature
• Genetic differentiation causing local adaptation to a range of abiotic or biotic environments
• Legally protected field sites for long-term ecological studies

Variation in sequence and phenotype
• Nucleotide variants in natural populations
• Abiotic and biotic environmental factors correlated with each segregating haplotype
• Evolutionary forces underlying nucleotide variation inferred from molecular evolution analyses
• Characterized phenotypes under natural conditions for each variant
• Impact of variants on fitness, abundance, range, and persistence known
• Structure and dynamics of the natural population known

Figure 1.4 Criteria in evolutionary and ecological functional genomics for a model species, according to Feder and Mitchell-Olds (2003). At
present few species satisfy all criteria. Reproduced by permission of Nature Publishing Group.

Not all ecological models will enjoy the type of
in-depth investigation now dedicated to yeast, fly,
worm, and weed. Murray (2000) points out that the
development of genome-based tools has a strong
element of positive feedback; the rich (that is,
widely studied organisms) get richer and the poor
get poorer. This development has already been felt
in the fields of animal and plant physiology, where
many of the species traditionally investigated in
comparative physiology and biochemistry have
been abandoned in favour of models that can be
genetically manipulated to study the function of
genes. Murray (2000) predicted that ‘the larger its
genome and the fewer its students, the more likely
work on an organism is to die’. Crawford (2001) has
argued, however, that functional genomics should
resist this tendency and instead choose species best
suited to addressing specific physiological or biochemical processes. For example, the Nobel Prize
for Medicine was given to H. A. Krebs for his
research on the citric acid cycle, which was conducted on common doves. By modern standards
the dove is a non-model species, but it was chosen
because its breast muscle is very rich in mitochondria. In animal physiology, Krogh’s principle assumes
that for every physiological problem there is a species uniquely suited for its analysis (Gracey and
Cossins 2003). According to this principle, genomic
standard species are likely to be suboptimal for at
least some problems of physiology, because no
model is uniquely suited to answering all
questions.
DNA microarrays, with their associated massive
generation of data on expression profiles (see
Section 1.4), are one of the most tangible features of
modern genomics and are often seen as holding the
greatest promise for solving problems in ecology.
However, not all ecologists are convinced that
microarray-based transcription profiling is the best
way to advance the genomics revolution into ecology. Some authors suggest that microarrays have
already been overtaken by next-generation sequencing methods, which allow gene expression profiles
to be developed from brute force sequencing, rather
than from hybridization (see Section 1.4). Thomas
and Klaper (2004) saw a drawback in the fact that
genome-wide microarrays are available only for
genomic model species, whereas the interest of
ecologists is with species that are important in the
environment and amenable to ecological studies;
these two interests do not necessarily coincide.
Some authors have solved these issues by using
microarrays of model species to profile the transcriptome of non-models. In these cross-species
hybridizations it is assumed that there is sufficient
homology between the non-model and the model to
allow differential expression to be assessed reliably. For example, rainbow trout can be a reference
for other salmonid fish (Von Schalburg et al. 2005),
and Arabidopsis thaliana may function as a model for
other species of the family Brassicaceae. As an
example of a successful cross-species hybridization
study, Van de Mortel et al. (2006) studied gene
expression in roots of A. thaliana and Thlaspi caerulescens plants grown under deficient and excess zinc
supply. They applied the Agilent Arabidopsis 3
60-mer microarray containing all 27 000 annotated
Arabidopsis genes complemented by 10 000 non-annotated transcripts. Over 2000 genes showed significant differential expression between A. thaliana
and T. caerulescens at each zinc exposure. Many of
these genes appeared to function in metal homeostasis, abiotic stress response, and lignin biosynthesis. Obviously, the success of such experiments is
dependent on the sequence divergence between
model and non-model, although this is not always
decisive (Bar-Or et al. 2007). Cross-species hybridization seems to work best when using long probes
(cDNAs) and is risky in the case of short oligos. It is
always advisable to do a comparative genomic
hybridization (CGH) to check for probes on the
array that have insufficient homology across species (Machado et al. 2009).
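The CGH check described above can be illustrated with a minimal sketch. It assumes that genomic DNA from both the model and the non-model species has been hybridized to the same array, and that probes are kept only when the non-model signal is not too far below the model signal. The function name, the probe identifiers, the signal values, and the log2 cutoff of -1 are all hypothetical choices made for illustration; they are not taken from Machado et al. (2009) or from any particular array platform.

```python
import math


def filter_probes_by_cgh(cgh_signal, min_log2_ratio=-1.0):
    """Return the IDs of probes that pass a cross-species homology check.

    cgh_signal maps probe ID -> (model_signal, non_model_signal), both
    measured with genomic DNA. A probe is kept when
    log2(non_model / model) >= min_log2_ratio (hypothetical cutoff).
    """
    kept = []
    for probe_id, (model, non_model) in cgh_signal.items():
        # Skip probes without a usable signal in either species.
        if model <= 0 or non_model <= 0:
            continue
        if math.log2(non_model / model) >= min_log2_ratio:
            kept.append(probe_id)
    return kept


# Hypothetical genomic DNA hybridization intensities.
cgh_signal = {
    "probe_A": (1000.0, 900.0),  # similar signal: homology likely sufficient
    "probe_B": (1200.0, 150.0),  # eight-fold lower signal: probe is dropped
    "probe_C": (800.0, 500.0),   # moderately lower signal: probe is kept
}

kept_probes = filter_probes_by_cgh(cgh_signal)
print(kept_probes)
```

Only the probes that pass such a filter would then be used when comparing expression between model and non-model; a stricter or looser cutoff trades the number of usable probes against the reliability of each one.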
The use of microarrays in ecology to better understand genetic mechanisms underlying species interactions, adaptations, and evolutionary processes
has increased rapidly (Kammenga et al. 2007). With
the help of next-generation sequencing technology
(see Section 1.4) it is feasible to establish a whole
transcriptome microarray within 12 months of commencing a project; neither the estimated expense
nor the availability of technology need be major
obstacles for progress. Given the fact that the
number of completely sequenced organisms is
increasing month by month, we can expect that
within a few years the genomes of almost all ecologically relevant species will be available to be
probed to address a multitude of ecological
questions.

1.3 -Omics speak
Because of the immediately attractive upswing created by the genomics revolution, and the large
financial resources made available in many industrialized countries, adjacent fields of science have
adopted terms echoing genomics. This has led to a great
proliferation of designations such as transcriptomics, proteomics, and metabolomics, and some
biologists have complained that what used to be molecular biology is now named after one of the
‘-omics’ while in fact it is still molecular biology. Zhou
et al. (2004) proposed a classification of genomics
according to three main categories: approach
(structural or functional), scientific discipline (evolutionary genomics, ecological genomics, etc.), and
object of study (plant genomics, microbial genomics, etc.). An Internet page maintained by Mary
Chitty (Cambridge Healthtech Institute) provides a
glossary of the various terms that have arisen with
the emergence of ‘-omics’ technologies (www.genomicglossaries.com). There are obvious terms
such as pharmacogenomics and cardiogenomics,
and awkward ones such as saccharomics (the study
of all the carbohydrates in the cell) and vaccinomics (the use of genomics for vaccine development).
The three most common extensions of genomics
are transcriptomics, proteomics, and metabolomics,
and these are introduced briefly here, with reference to Fig. 1.5.
Transcriptomics is the study of all the transcripts
that are present at any time in the cell. In principle
the transcriptome includes messenger RNAs
(mRNAs) in addition to ribosomal RNAs (rRNAs),
transfer RNAs (tRNAs), and small nuclear RNAs
(snRNAs), but transcriptomics is usually limited to
mRNA, the template for translation into protein.
The main activity in transcriptomics is to obtain a
