Genetic relationships and inbreeding
You have probably realised by now that the topics of genetic relationships between individuals
and inbreeding are very closely related.
By its very definition, inbreeding results from the mating between related individuals.
The closer the relationship, the greater the amount of inbreeding.
The figure illustrates this for three relationships, including the possibility of self-fertilisation in plants.
The inbred individuals and parents are shown in red. The number of generations
separating the two parents from their common ancestor(s) is shown in parentheses.
The total probability of homozygosity, which is the degree of inbreeding, is shown for offspring.
Evidently the more closely related the parents, the higher the inbreeding in the offspring.
Measuring inbreeding - Calculation of the inbreeding coefficient
The usual measure of inbreeding, the inbreeding coefficient (F), can be thought of as simply
the fraction of the genome that is identical by descent from a common ancestor or ancestors.
In the example above it looks like approximately 0.3.
Inbreeding results from mating of related individuals.
The expectation for inbreeding usually considers just a single locus.
The general way of picturing this is shown on the right, with
an inbred individual and a single ancestor known as a 'common ancestor'.
Because the parents of the inbred individual are related, there must be at least two pathways from a common ancestor.
The individual is separated from this common ancestor by $n_1$ and $n_2$
generations respectively along the two pathways, which may be different.
Each allele of the common ancestor, A1 or A2,
has probability ½ of being passed on in each generation.
The overall probability of one particular allele being passed to the individual on both pathways is therefore
$(\frac{1}{2})^{n_1 + n_2}$. To take account of the fact that
there are two possible alleles, this value must be doubled, giving
$(\frac{1}{2})^{n_1 + n_2 - 1}$.
The overall coefficient of inbreeding may involve multiple pathways from a common ancestor,
as well as multiple common ancestors. The inbreeding coefficient is obtained by summing over all such pathways.
The calculation for complex pedigrees can be laborious.
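The summation over pathways can be sketched in a few lines of Python. This is a minimal illustration, not part of the program: the function name `inbreeding_from_paths` is hypothetical, each pathway pair is given as the two generation counts (n1, n2) back to one common ancestor, each pair contributes (½)^(n1 + n2 - 1), and the common ancestors are assumed not to be inbred themselves.

```python
from fractions import Fraction

def inbreeding_from_paths(paths):
    """Sum (1/2)^(n1 + n2 - 1) over all (n1, n2) pathway pairs.

    Each tuple gives the number of generations separating the inbred
    individual from one common ancestor along its two pathways.
    Assumes the common ancestors are themselves not inbred.
    """
    return sum(Fraction(1, 2) ** (n1 + n2 - 1) for n1, n2 in paths)

# Self-fertilisation: the single parent is one generation back on both pathways.
f_selfing = inbreeding_from_paths([(1, 1)])

# Parent-offspring mating: the parent is the common ancestor, one generation
# back on one pathway and two generations back on the other.
f_parent_offspring = inbreeding_from_paths([(1, 2)])

# Full-sib mating: two common ancestors (the grandparents of the inbred
# individual), each two generations back on both pathways.
f_full_sibs = inbreeding_from_paths([(2, 2), (2, 2)])

print(f_selfing, f_parent_offspring, f_full_sibs)
```

Exact fractions are used so that the results can be compared directly with the values expected from theory.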
Measuring inbreeding - Variation between individuals
It is clear from the simulations that, by chance, different individuals of the same relationship have differing
degrees of inbreeding. The diagram below shows the variation in inbreeding for a set of first-cousin matings.
Expected inbreeding for first cousins
There are two common ancestors, the two great-grandparents, shown here
with the contribution from the female great-grandparent highlighted.
Substituting 3 for both $n_1$ and $n_2$
from the figure of the previous page gives
the contribution to inbreeding as
$(\frac{1}{2})^{3 + 3 - 1} = \frac{1}{32}$.
Summing this with the equal contribution from the male great-grandparent gives the total inbreeding coefficient as 1/16.
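The expected value of 1/16 can be checked independently by following alleles down the pedigree. The sketch below is an illustration only: it assumes a single locus, unrelated and non-inbred founders, and hypothetical allele labels "a" to "h". It enumerates every equally likely set of transmissions and counts those in which the inbred child receives two copies of the same ancestral allele.

```python
from fractions import Fraction
from itertools import product

def first_cousin_inbreeding():
    """Enumerate every equally likely transmission at one locus."""
    ggp_f, ggp_m = ("a", "b"), ("c", "d")      # the two great-grandparents
    spouse1, spouse2 = ("e", "f"), ("g", "h")  # the two unrelated grandparents
    identical = total = 0
    # Ten independent choices: which allele each parent passes on.
    for bits in product((0, 1), repeat=10):
        sib1 = (ggp_f[bits[0]], ggp_m[bits[1]])
        sib2 = (ggp_f[bits[2]], ggp_m[bits[3]])
        cousin1 = (sib1[bits[4]], spouse1[bits[5]])
        cousin2 = (sib2[bits[6]], spouse2[bits[7]])
        child = (cousin1[bits[8]], cousin2[bits[9]])
        identical += child[0] == child[1]
        total += 1
    return Fraction(identical, total)

print(first_cousin_inbreeding())
```

Because founder alleles all carry distinct labels, two alleles in the child are equal only when they are copies of the same great-grandparental allele.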
The mean is reasonably close to expectation. The variation is very high, with a substantial proportion having
no inbreeding at all. This variation is, however, amplified by the small number of chromosomes simulated, 2 in this case.
The variation would be substantially less for an increased number of chromosomes, e.g. the 23 pairs in humans.
Measuring inbreeding - Inbreeding depression
Numerous studies dating from those of Charles Darwin have shown that inbreeding is deleterious
in organisms that normally outbreed.
At the genetic level, inbreeding increases the number of homozygous genotypes
and decreases the number of heterozygous genotypes.
The fact that inbreeding is deleterious means that
heterozygous genotypes are, on average, better than homozygous genotypes.
The usually accepted reason for this disadvantage of homozygotes is the existence of deleterious recessive genotypes.
There are many cases in humans of genetic diseases such as
cystic fibrosis, sickle cell anaemia etc, all of which
only occur if the genotype is homozygous.
Individually they are rare, presumably because they are deleterious.
But there are potentially many genes subject to recessive mutation of this kind.
Selection at such loci is usually modelled as follows, where A2 is the deleterious recessive allele:
Genotype   A1A1   A1A2   A2A2
Fitness    1      1      1 - s
It is still an open question as to whether inbreeding depression is entirely due to
deleterious recessive genotypes, or whether at some loci heterozygous genotypes are
advantageous over both homozygous genotypes.
Selection in that case would be modelled as follows:
Genotype   A1A1     A1A2   A2A2
Fitness    1 - s1   1      1 - s2
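The effect of inbreeding under either selection model can be illustrated numerically. The sketch below is not taken from the program. It assumes the standard population-genetic result that, with inbreeding coefficient F and allele frequencies p and q, each homozygote frequency rises by Fpq and the heterozygote frequency falls to 2pq(1 - F). The function names, and the values chosen for q, s, s1 and s2, are hypothetical.

```python
def genotype_frequencies(q, F):
    """Frequencies of (A1A1, A1A2, A2A2) when allele A2 has frequency q."""
    p = 1.0 - q
    return (p * p + F * p * q, 2.0 * p * q * (1.0 - F), q * q + F * p * q)

def mean_fitness(q, F, fitnesses):
    """Average fitness, given fitnesses for (A1A1, A1A2, A2A2)."""
    freqs = genotype_frequencies(q, F)
    return sum(freq * w for freq, w in zip(freqs, fitnesses))

# Deleterious recessive model: fitnesses 1, 1, 1 - s (illustrative values).
s, q = 1.0, 0.01
recessive = (1.0, 1.0, 1.0 - s)
outbred = mean_fitness(q, 0.0, recessive)
cousins = mean_fitness(q, 1.0 / 16.0, recessive)

# Heterozygote advantage model: fitnesses 1 - s1, 1, 1 - s2 (illustrative values).
overdominant = (1.0 - 0.1, 1.0, 1.0 - 0.2)
outbred_od = mean_fitness(0.3, 0.0, overdominant)
cousins_od = mean_fitness(0.3, 1.0 / 16.0, overdominant)

print(outbred, cousins, outbred_od, cousins_od)
```

In both models the mean fitness of the inbred group is lower than that of the outbred group, which is why inbreeding depression alone does not distinguish between the two explanations.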
Measuring inbreeding - DNA and Runs of Homozygosity (ROH) (Advanced topic)
The ease of DNA sequencing has opened up the possibility of directly estimating inbreeding.
There are now many species for which a genome sequence is available, and for which
runs of homozygosity in an individual can be estimated.
Sites in the DNA where differences have been found in populations are denoted as SNPs
(single nucleotide polymorphisms), usually with two different bases.
These just have to be shown to be homozygous
in an individual, ie AA, CC, GG or TT.
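Detecting runs of consecutive homozygous SNPs can be sketched as follows. This is a simplified illustration rather than the method used by any particular software: the function name and the `min_snps` threshold are hypothetical, genotypes are given as two-letter strings in chromosomal order, and physical distances between SNPs are ignored. In real data a much larger threshold would be needed, since short runs of homozygous SNPs arise by chance.

```python
def runs_of_homozygosity(genotypes, min_snps=3):
    """Return (start, end) index pairs, end exclusive, of homozygous runs."""
    runs, start = [], None
    for i, genotype in enumerate(genotypes):
        if genotype[0] == genotype[1]:      # homozygous: AA, CC, GG or TT
            if start is None:
                start = i                   # a new run begins here
        else:                               # heterozygous SNP ends any run
            if start is not None and i - start >= min_snps:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the chromosome.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes)))
    return runs

snps = ["AA", "CC", "GG", "TT", "AG", "CC", "CT", "GG", "GG", "AA"]
roh = runs_of_homozygosity(snps)
print(roh)
```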
A closely related topic is the detection of matched sequences in humans.
You may have submitted your DNA to a commercial DNA sequencing company, and found that
it was possible to detect relatives by DNA homology.
The figure shows an example of the matching of chromosomes for a first-cousin relationship.
ROH can be detected directly from DNA sequences. It may seem surprising that sequences from
different individuals, rather than from the same individual, can be matched in the same way,
since sequencing detects pairs of DNA bases rather than single DNA bases.
On the other hand, sequence matching implies the matching of single DNA strands.
As shown in the diagram below, it is nevertheless possible to tell approximately where single-strand
matching starts and ends.
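One way to see how single-strand matching can be inferred from paired bases is the following sketch. It rests on an assumption that is not stated in the text: two people cannot share a strand at a SNP where they are homozygous for different bases (e.g. AA and GG), whereas any other combination is compatible with one shared strand. The function name and the `min_snps` threshold are hypothetical.

```python
def half_identical_segments(person1, person2, min_snps=3):
    """Runs of SNPs, as (start, end) with end exclusive, where one strand could match."""
    def compatible(g1, g2):
        # At least one base in common, so one strand may be shared.
        return bool(set(g1) & set(g2))

    segments, start = [], None
    for i, (g1, g2) in enumerate(zip(person1, person2)):
        if compatible(g1, g2):
            if start is None:
                start = i
        else:
            # Opposite homozygotes: any shared segment must end here.
            if start is not None and i - start >= min_snps:
                segments.append((start, i))
            start = None
    n = min(len(person1), len(person2))
    if start is not None and n - start >= min_snps:
        segments.append((start, n))
    return segments

p1 = ["AA", "AG", "CC", "TT", "GG"]
p2 = ["AG", "GG", "CT", "CC", "AG"]
segments = half_identical_segments(p1, p2, min_snps=2)
print(segments)
```

The boundaries found this way are only approximate, because compatible genotypes can also occur by chance outside a truly shared segment.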
Measuring relationships - the complication of inbreeding
In this program, we have measured the degree of relationship as "the fraction of genes shared by two individuals".
On the surface, this looks very straightforward. Parent and offspring share half their genes,
grandparent and grandchild share one quarter on average, etc.
What could be simpler?
One complication comes when we need to take into account both degree of relationship and the possibility of inbreeding.
The diagram below shows the results of a parent-offspring mating.
The relationship of interest is the one outlined by the dotted rectangle, between a parent and an inbred offspring.
Gene sharing between the two is shown in the following diagram, with unrelated segments removed:
Clearly parent and offspring share more than one-half of their genes in this case.
Measuring relationships - two different measures
(1) Wright's measure. The original measure, due to the population geneticist Sewall Wright (1889 - 1988),
is denoted the 'Coefficient of Relationship'.
In practice, it is similar to the measure of gene sharing used in this program.
But it is based on correlation coefficients, using a rather complex method Wright developed
called 'Path Coefficients'.
(2) Malécot's measure.
This measure, due to the French mathematical geneticist Gustave Malécot (1911 - 1998), is
denoted the 'Coefficient de Parenté'.
Malécot's approach is a much simpler probability measure, and it has become the standard method of calculation.
The Coefficient de Parenté
This relates to alleles at a single locus. It is defined as follows for two individuals:
Choose an allele at random from Individual 1
Choose an allele at random from Individual 2
What is the probability that they are identical?
Individual 1 is A1A2
Individual 2 is A1A3
The probability of identity (Coefficient de Parenté) is the probability of drawing the A1 twice,
ie ½ x ½ = 0.25.
But these two individuals share half their genes. Wright's coefficient of relationship would be 0.5.
In general, Wright's coefficient is 2 x Malécot's coefficient. See next page for details.
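The definition can be expressed directly as a calculation over the four possible draws. In this minimal sketch the function name is hypothetical, and alleles carrying the same label (such as "A1") are assumed to be identical by descent.

```python
from fractions import Fraction
from itertools import product

def coefficient_de_parente(genotype1, genotype2):
    """Probability that one random allele from each individual is identical."""
    pairs = list(product(genotype1, genotype2))   # the four equally likely draws
    matches = sum(a == b for a, b in pairs)
    return Fraction(matches, len(pairs))

parente = coefficient_de_parente(("A1", "A2"), ("A1", "A3"))
print(parente, 2 * parente)
```

For the two individuals above, only the draw of A1 from both gives a match, so the function returns ¼, and doubling gives ½.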
Measuring relationships - a comparison of measures (Advanced topic)
Why use Malécot's measure?
The value of defining the relationship between individuals in this way is that it exactly mirrors the
genetic process of producing offspring. Two randomly chosen parental genes combine to produce the offspring genotype.
Defined in this way, the relationship between parents is exactly the probability of identity in the offspring,
ie. the inbreeding coefficient of the offspring.
The method allows for multi-generation calculations of relationship and inbreeding probabilities.
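A multi-generation calculation of this kind can be sketched with a short recursive function. This is an illustration under stated assumptions, not the program's own code: founders are taken to be unrelated and non-inbred, the pedigree dictionary must list parents before their offspring, and the names `make_kinship` and `first_cousins` are hypothetical. The recursion uses two rules: an individual's value with itself is ½(1 + F), and its value with anyone else is the average of its two parents' values with that individual.

```python
from fractions import Fraction
from functools import lru_cache

def make_kinship(pedigree):
    """pedigree maps name -> (father, mother); founders map to (None, None).

    Entries must be listed with parents before their offspring.
    """
    order = {name: i for i, name in enumerate(pedigree)}

    @lru_cache(maxsize=None)
    def kinship(a, b):
        if a is None or b is None:
            return Fraction(0)              # unknown parent: assumed unrelated
        if a == b:
            father, mother = pedigree[a]
            # Self value is (1 + F) / 2, with F the parents' kinship.
            return Fraction(1, 2) * (1 + kinship(father, mother))
        if order[a] < order[b]:
            a, b = b, a                     # always expand the younger individual
        father, mother = pedigree[a]
        return Fraction(1, 2) * (kinship(father, b) + kinship(mother, b))

    return kinship

first_cousins = {
    "ggp_f": (None, None), "ggp_m": (None, None),
    "spouse1": (None, None), "spouse2": (None, None),
    "sib1": ("ggp_m", "ggp_f"), "sib2": ("ggp_m", "ggp_f"),
    "cousin1": ("sib1", "spouse1"), "cousin2": ("sib2", "spouse2"),
    "child": ("cousin1", "cousin2"),
}
kinship = make_kinship(first_cousins)
# The kinship of the two parents is the inbreeding coefficient of the child.
f_child = kinship("cousin1", "cousin2")
print(f_child)
```

Results are cached, so each pair of individuals is evaluated only once even in pedigrees with many loops.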
Why use Wright's measure?
This measure makes intuitive sense in defining relationships. Parent and offspring share half their genes.
By contrast, Malécot's measure is one-quarter for this relationship. Even more confusingly, the measure of
relationship of an individual with itself, or equivalently between identical twins, is one half.
Intuitively, it ought to be one.
Can one just double the value to go from one measure to the other?
Almost. For the example above, where the two individuals are A1A2
and A1A3, this rule works, giving ¼ and ½ respectively.
Similarly if the two individuals are both A1A2,
the measures are ½ and 1.
But what if the genotypes are A1A1
and A1A2? Malécot's measure gives the value ½.
In cases such as this involving inbred individuals Malécot's measure works well,
even when it is difficult to describe the fraction of genes shared.
How does one translate Malécot's Coefficient de Parenté into English?
The possibilities seem to be 'Coefficient of Relationship' or 'Coefficient of Kinship'.
The former translation would directly conflict with Wright's original definition.
A Wikipedia site
comparing these measures uses Coefficient of Kinship to avoid this problem. There are some problems
with this web site, but it does contain a list of expectations for many different relationships.
Drawing of pedigrees
Pedigrees are usually drawn in the form:
This method uses horizontal lines to emphasise the mating.
The alternative form, as used in GeneShare,
makes it easier to follow the passage of genes from parent to offspring.
The Parent-Offspring pedigree can readily be drawn in the alternative form.
Such pedigrees with overlapping generation structure cannot be drawn using just horizontal lines to connect parents.
The use of four colours (1) (Advanced topic)
It is convenient to use four distinguishable colours in the GeneShare pedigrees.
This use is, however, misleading if it leads to the impression that colour has anything to do
with the function of the gene. In the diagram below, gene regions labelled A are of the same colour,
and should be similar, while gene regions labelled B should be different.
The reality, of course, is the opposite. Genes of the two regions labelled as A have nothing to do with each other,
while genes of B are just 'allelic variants' of each other.
One improvement from this point of view would be to distinguish the two chromosomes.
Doing this would require 8 colours for the two parents.
But this would not solve the problem of distinguishing regions within chromosomes.
The use of four colours (2)
Here's an attempt to differentially colour different regions for the two parents.
It's not easy to distinguish anything.
When non-identical regions are blanked out for parent and offspring, however,
the result shows almost the same information as when four colours are used.
In conclusion, the use of four colours is simpler and more informative, with the
proviso that colour signifies nothing about chromosomal region.
The special case of siblings (Advanced topic)
The comparison of siblings in the simulations focuses on the identity of single DNA stretches.
The diagram below shows an additional comparison, DNA stretches where both
strands are identical by descent.
Note that when parent and offspring are compared, it is not possible for both strands to be identical by descent.
Sibs share half their genes, as do parent and offspring, so the comparisons should be similar.
However if, for example, the region in question contains a recessive gene, both sibs can
carry the same recessive phenotype, whereas, in general, parent and offspring do not.
If you study quantitative genetics, you will already know something similar.
The correlation between sibs
contains a component from the dominance variance, a component that is absent from the parent-offspring correlation.
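The difference between sibs and parent-offspring pairs can be made concrete by enumerating the possibilities at a single locus. The sketch below is an illustration only: it assumes unrelated, non-inbred parents and no recombination within the region, and the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def sib_sharing_distribution():
    """Probability that two sibs share 0, 1 or 2 strands identical by descent."""
    counts = Counter()
    # Each sib receives strand 0 or 1 from the father and from the mother.
    for pat1, mat1, pat2, mat2 in product((0, 1), repeat=4):
        shared = (pat1 == pat2) + (mat1 == mat2)
        counts[shared] += 1
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in sorted(counts.items())}

distribution = sib_sharing_distribution()
mean_shared = sum(k * prob for k, prob in distribution.items())
print(distribution, mean_shared)
```

On average the sibs share one strand out of two, i.e. half their genes, but in a quarter of cases both strands are shared. A parent and offspring of unrelated parents always share exactly one strand.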
What is a gene? (Advanced topic)
The term "gene" has been used extensively in this program. Everyone knows what this means??
There is, unfortunately, some debate about the use of the term. A recent article,
The Evolving Definition of the Term "Gene", discusses
the difficulty of knowing what the term means now.
Quoting from the article:
'......genes are not autonomous, independent agents ......
Rather, they exert their effects within......
"genetic regulatory networks" (GRNs).'
So it seems that "genes" are out, and "GRNs" are in.
The traditionally used term "locus", a location on the chromosome,
is probably still acceptable. So is the term "allele", an alternative (gene?) at a locus.
But "allele" also has its limitations when discussing "alleles" at more than one locus.
An allele at a first locus and an allele at a second locus are not, strictly speaking, alleles.
In summary, terms such as "gene sharing" might not be acceptable
if submitted to a research journal. Hopefully they are understood in the genetic simulations here.