A simulation program to illustrate the relationship between relatives

GeneShare is a simulation program showing what proportion of genes an individual shares with a parent, grandparent, brother or sister. It illustrates what happens when chromosomes are passed between parent and offspring, including the effects of crossing over. It also allows the study of inbreeding.

The program will allow you to choose any pair of individuals in a pedigree, and then see what genes, or parts of the chromosome, are shared between the two. Initially you will build up the pedigree yourself, but the program then allows you to quickly generate alternatives. Arrows are provided to navigate through the program.

Because of the limitation of screen size it is not possible to show anything like a complete genome. Two chromosomes are shown instead, which is sufficient to show the basic principles. The program also uses a simplified recombination mechanism, with one recombination point in each chromosome arm.
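The simplified recombination mechanism can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not GeneShare's own code: the representation of a chromosome as a list of labels (one per locus), the function name make_gamete and the centromere index are all hypothetical.

```python
import random

def make_gamete(hom_a, hom_b, centromere, rng=random):
    """Return one recombinant chromosome built from two parental homologues.

    hom_a, hom_b: equal-length lists of labels, one per locus (assumed layout).
    centromere:   index separating the two chromosome arms (assumed).
    Exactly one crossover point is drawn in each arm.
    """
    n = len(hom_a)
    assert len(hom_b) == n and 0 < centromere < n
    cut_left = rng.randint(0, centromere)    # crossover point in the first arm
    cut_right = rng.randint(centromere, n)   # crossover point in the second arm
    # Choose at random which homologue the gamete starts with.
    first, second = (hom_a, hom_b) if rng.random() < 0.5 else (hom_b, hom_a)
    # Switch homologue at the first crossover and switch back at the second.
    return first[:cut_left] + second[cut_left:cut_right] + first[cut_right:]

# Example: a parent whose two homologues are labelled A and B at every locus.
gamete = make_gamete(["A"] * 10, ["B"] * 10, centromere=3, rng=random.Random(1))
print("".join(gamete))
```

Because there is one crossover per arm, the gamete changes homologue at most twice along its length.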


Inbreeding implies the mating of relatives. This section starts with the mating between cousins. It then looks at closer forms of inbreeding, sib-mating and parent-offspring mating, which are of importance in animal breeding.

Fraction shared (mean) Trial

Take a moment to look at the figure above. It shows a 3-generation pedigree. The founder individuals are female parent 1 and male parent 2. They have two offspring, siblings 3 and 4. Each of these has one offspring, so that individuals 7 and 8 are first cousins. Individuals 5 and 6 are shown with much smaller symbols than the others. They are unrelated to the founder individuals and do not play a significant role in the relationships. Note the unconventional lines for showing matings and offspring, which make it easier to see the passage of genes and chromosomes between generations.

This pedigree is identical to the previous pedigree except for the addition of an offspring from mating of the first cousins.

This pedigree shows (full) sib mating.

Genotypes are now shown. There are two chromosomes (all that can fit). One has a centromere close to the end and the other a centromere close to the middle. Each chromosome set is shown in a different colour, allowing you to follow all regions separately in the following generations.

Genotypes are now shown as previously.

The exercise is now to fill in genotypes for the remainder of the pedigree. Clicking on an individual will produce a gamete, which is shown between the two parent chromosomes. This gamete can then be dragged to an offspring. Once an offspring contains two gametes, it can then produce gametes itself.

Please drag the gamete to one of the offspring.

Please fill in genotypes for the remainder of the pedigree, by clicking on an individual to produce a gamete which can then be dragged to an offspring. Notice that gametes from the female parent go to the first position in the offspring, and gametes from the male parent go to the second position.

The pedigree is now complete. We can look at a number of different relationships:
(1) Between parent and offspring
(2) Between grandparent and grandchild
(3) Between siblings (brother and sister)
(4) Between first cousins
(5) Between an individual and aunt/uncle

The pedigree is now complete. We look at what is happening in the inbred individual.

Please select related individuals

The potentially inbred individual is selected. The next screen will show just the regions where the female- and male-derived chromosomes are identical.

Please select an individual to check for effects of inbreeding

(1) Parent - Offspring

Please select two individuals, one a parent and the other its offspring. You can select and unselect by clicking on individuals. PS. This may be easiest, at least initially, if you click on one of the two original parents and an offspring.

(1) Parent - Offspring

Please select two individuals, one a parent and the other its offspring.

Please select two individuals.

Please select an individual.

Good. The next diagram will make it easier to see how much of the genetic material these two have in common, by blanking out regions that are different in the two.

The next diagram will make it easier to check on inbreeding

Use the back arrow to see complete pedigree

The diagram now shows colour only for the regions that are shared between parent and offspring. If you want, you can use the back arrow to see the original colours in these two, and then to see all the original colours.

The diagram now shows colour only for the regions that are shared between the two. If you want, you can use the back arrow to see the original colours in these, and then to see all the original colours.

The diagram now shows original colours only for regions where there are two copies.

The diagram now shows original colours only for regions where there are two copies. But in this case there are NONE.

It should be clear that exactly half the genes are shared by parent and offspring. The blocks of shared genes are identical, except for a few recombination events that break them up.
NB Look at the parent to see where the recombination events occur.

As previously, at each point along the chromosome, one gene is shared and one is not.

As with relationships, chance plays a role in determining the extent of identical segments.

Notice that there is now no exact relationship in genes shared between the two individuals. It depends on which chromosome is inherited and where the crossovers are. This is different from the parent-offspring relationship, where the two always share exactly half the genetic material.
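The contrast between the two cases can be checked with a small calculation. The sketch below is an assumption about how "fraction shared" might be computed, not the program's own method: each individual is taken to be a pair of chromosomes, each a list of founder labels (m1, m2 for the mother's two chromosomes, f1, f2 for the father's), and fraction_shared is a hypothetical helper name.

```python
from collections import Counter

def fraction_shared(ind1, ind2):
    """Fraction of gene copies shared by two diploid individuals.

    Each individual is (maternal, paternal): two equal-length lists of
    founder labels, one label per locus (assumed representation).
    """
    n = len(ind1[0])
    shared = 0
    for i in range(n):
        c1 = Counter([ind1[0][i], ind1[1][i]])
        c2 = Counter([ind2[0][i], ind2[1][i]])
        shared += sum((c1 & c2).values())   # 0, 1 or 2 copies shared at this locus
    return shared / (2 * n)

mother = (["m1"] * 4, ["m2"] * 4)
child = (["m1", "m1", "m2", "m2"], ["f1", "f2", "f2", "f2"])
sib = (["m1", "m2", "m2", "m1"], ["f2", "f2", "f1", "f1"])

print(fraction_shared(mother, child))  # 0.5
print(fraction_shared(child, sib))     # 0.375
```

Whatever recombinant chromosome the child receives, exactly one copy per locus comes from the mother, so the first value is always one half. The second value changes with every new pair of sibs.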

You can now nominate to look at other relationships, grandparent-grandchild, brother-sister, etc. Click the forward button to do this. Alternatively you can use the menu to look at different relationships.

Please choose a relationship:

It is important to try new pedigrees to check on the relationship. The program allows you to do this. Clicking the forward button will show the same two individuals but with a new simulated pedigree.

It is important to try new pedigrees to check on the degree of inbreeding. The program allows you to do this. Clicking the forward button will show the same individual but with a new simulated pedigree.

Do you wish to start another relationship? Press the back arrow if so, or the next arrow to continue.

The program now allows you to repeat the simulation, showing just the two individuals you have selected.


A second set of chromosomes is added automatically, coming from the unrelated parent.

Please select 2 individuals.

Please select just one individual.

(1) Sorry, that looks like the wrong pair. Please try again. You can select and unselect by clicking on individuals.


Are you sure that you want to move on from the study of relationships to the study of inbreeding? Press the forward arrow to confirm.

The exact colours are not important in this program, but if you have some form of colourblindness or have problems with the chosen colours, you can change them here.

Click on one of the chromosomes below. This will activate the 'colourpicker' box, which you can click to try to find distinguishable colours. When you have found a suitable colour, click outside this box to transfer the colour to the clicked chromosome. You can modify as many colours as you would like.

Genetic relationships and inbreeding

You have probably realised by now that the topics of genetic relationships between individuals and inbreeding are very closely related. By its very definition, inbreeding results from the mating between related individuals. The closer the relationship, the greater the amount of inbreeding.

The figure illustrates this for three relationships, including the possibility of self-fertilisation in plants. The inbred individuals and parents are shown in red. The number of generations separating the two parents from their common ancestor(s) is shown in parentheses. The total probability of homozygosity, which is the degree of inbreeding, is shown for offspring. Evidently the more closely related the parents, the higher the inbreeding in the offspring.

Measuring inbreeding - Calculation of the inbreeding coefficient

The usual measure of inbreeding, the inbreeding coefficient (F), can be thought of as simply the fraction of the genome that is identical by descent from a common ancestor or ancestors. In the example above it looks like approximately 0.3.

Inbreeding results from mating of related individuals. The expectation for inbreeding usually considers just a single locus. The general way of picturing this is shown on the right, with an inbred individual and a single ancestor known as a 'common ancestor'.

Because the parents of the inbred individual are related, there must be at least two pathways from a common ancestor. The individual is separated from this common ancestor by n1 and n2 generations respectively, which may be different. Each allele, A1 or A2, has probability ½ of being passed on in each generation. The overall probability of either allele being passed to the individual on both pathways is therefore (½)^(n1+n2). To take account of the fact that there are two possible alleles, this value must be doubled, giving (½)^(n1+n2-1).

The overall coefficient of inbreeding may involve multiple pathways from a common ancestor, as well as multiple common ancestors. The inbreeding coefficient is obtained by summing over all such pathways. The calculation for complex pedigrees can be laborious.
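The summation over pathways is easy to sketch in Python. The function name and the (n1, n2) input format below are illustrative assumptions, and the sketch also assumes that the common ancestors are not themselves inbred.

```python
from fractions import Fraction

def inbreeding_coefficient(paths):
    """Sum (1/2)^(n1 + n2 - 1) over all pathways.

    paths: list of (n1, n2) pairs, one per pathway through a common ancestor,
    where n1 and n2 are the numbers of generations separating the inbred
    individual from that ancestor on each side.
    Assumes the common ancestors are not themselves inbred.
    """
    return sum((Fraction(1, 2) ** (n1 + n2 - 1) for n1, n2 in paths), Fraction(0))

print(inbreeding_coefficient([(3, 3), (3, 3)]))  # first-cousin mating: 1/16
print(inbreeding_coefficient([(2, 2), (2, 2)]))  # full-sib mating: 1/4
print(inbreeding_coefficient([(1, 2)]))          # parent-offspring mating: 1/4
print(inbreeding_coefficient([(1, 1)]))          # self-fertilisation: 1/2
```

Exact fractions are used so that the results can be compared directly with the expected values.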

Click on the arrow for more screens showing different aspects of inbreeding.

Measuring inbreeding - Variation between individuals

It is clear from the simulations that, by chance, different individuals of the same relationship have differing degrees of inbreeding. The diagram below shows the variation in inbreeding for a set of first-cousin matings.

Expected inbreeding for first cousins

There are two common ancestors, the two great-grandparents, shown here with the contribution from the female great-grandparent highlighted. Substituting 3 for n1 and n2 from the figure of the previous page gives the contribution to inbreeding as (½)^(3+3-1) = 1/32. Summing this with the equal contribution from the male great-grandparent gives the total inbreeding coefficient as 1/16.

The mean is reasonably close to expectation. The variation is very high, with a substantial proportion having no inbreeding at all. This variation is, however, amplified by the small number of chromosomes simulated, 2 in this case. The variation would be substantially less for an increased number of chromosomes, e.g. the 23 pairs in humans.

Measuring inbreeding - Inbreeding depression

Numerous studies dating from those of Charles Darwin have shown that inbreeding is deleterious in organisms that normally outbreed.

What inbreeding does at the genetic level is that it increases the number of homozygous genotypes and decreases the number of heterozygous genotypes. The fact that inbreeding is deleterious means that heterozygous genotypes are, on average, better than homozygous genotypes.

Deleterious recessives
The usually accepted reason for this disadvantage of homozygotes is the existence of deleterious recessive genotypes. There are many cases in humans of genetic diseases such as cystic fibrosis, sickle cell anaemia etc, all of which only occur if the genotype is homozygous. Individually they are rare, presumably because they are deleterious. But there are potentially many genes subject to recessive mutation of this kind. Selection at such loci is usually modelled as follows:
Genotype          AA    Aa    aa
Selective value   1     1     1 - s

Heterozygote advantage
It is still an open question as to whether inbreeding depression is entirely due to deleterious recessive genotypes, or whether at some loci heterozygous genotypes are advantageous over both homozygous genotypes.
Genotype          A1A1      A1A2    A2A2
Selective value   1 - s1    1       1 - s2
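Both selection models predict that mean fitness falls as inbreeding rises. The sketch below illustrates this using the standard textbook genotype frequencies under inbreeding, which are not given in this program: with allele frequencies p and q = 1 - p and inbreeding coefficient F, the two homozygotes have frequencies p² + Fpq and q² + Fpq, and the heterozygote 2pq(1 - F). The function names and the numerical values of p and s are assumptions chosen for illustration.

```python
def genotype_frequencies(p, F):
    """Frequencies of the two homozygotes and the heterozygote under inbreeding F."""
    q = 1.0 - p
    hom1 = p * p + F * p * q
    het = 2.0 * p * q * (1.0 - F)
    hom2 = q * q + F * p * q
    return hom1, het, hom2

def mean_fitness(p, F, w_hom1, w_het, w_hom2):
    """Average selective value over the three genotypes."""
    hom1, het, hom2 = genotype_frequencies(p, F)
    return hom1 * w_hom1 + het * w_het + hom2 * w_hom2

for F in (0.0, 1 / 16, 1 / 4):
    # Deleterious recessive: selective values 1, 1, 1 - s (here s = 1, q = 0.01).
    recessive = mean_fitness(0.99, F, 1.0, 1.0, 0.0)
    # Heterozygote advantage: 1 - s1, 1, 1 - s2 (here s1 = s2 = 0.1, p = 0.5).
    overdominant = mean_fitness(0.5, F, 0.9, 1.0, 0.9)
    print(F, recessive, overdominant)
```

In both models the loss of fitness grows in proportion to F, because inbreeding converts heterozygotes into homozygotes.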

Measuring inbreeding - DNA and Runs of Homozygosity (ROH) (Advanced topic)

The ease of DNA sequencing has opened up the possibility of directly estimating inbreeding. There are now many species for which a genome sequence is available, and for which runs of homozygosity in an individual can be estimated. Sites in the DNA where differences have been found in populations are denoted as SNPs, Single Nucleotide Polymorphisms, usually with two different bases. These just have to be shown to be homozygous in an individual, ie AA, CC, GG or TT.
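Finding runs of homozygous SNPs can be sketched as a simple scan along the chromosome. This is a deliberately minimal illustration, not the method used by any particular software: the function name, the two-letter genotype strings and the min_snps threshold are assumptions, and a real analysis would probably also take physical distance and genotyping errors into account.

```python
def runs_of_homozygosity(genotypes, min_snps=3):
    """Return (start, end) index pairs (end exclusive) for each run of at
    least min_snps consecutive homozygous SNPs.

    genotypes: two-letter strings such as 'AA' or 'AG', in chromosome order.
    """
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g[0] == g[1]:              # homozygous SNP: open or extend a run
            if start is None:
                start = i
        else:                         # heterozygous SNP: close any open run
            if start is not None and i - start >= min_snps:
                runs.append((start, i))
            start = None
    # Close a run that reaches the end of the chromosome.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes)))
    return runs

snps = ["AA", "CC", "AG", "TT", "GG", "CC", "AA", "CT", "GG"]
print(runs_of_homozygosity(snps))  # [(3, 7)]
```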

A closely related topic is the detection of matched sequences in humans. You may have submitted your DNA to a commercial DNA sequencing company, and found that it was possible to detect relatives by DNA homology. The figure shows an example of the matching of chromosomes for a first-cousin relationship.

ROH can be detected directly from DNA sequences. It may seem surprising that sequences from different individuals, rather than from the same individual, can be matched in the same way, since sequencing detects pairs of DNA bases rather than single DNA bases. On the other hand, sequence matching implies the matching of single DNA strands. As shown in the diagram below, it is nevertheless possible to tell approximately where single-strand matching starts and ends.

Measuring relationships - the complication of inbreeding

In this program, we have measured the degree of relationship as "the fraction of genes shared by two individuals". On the surface, this looks very straightforward. Parent and offspring share half their genes, grandparent and grandchild share one quarter on average, etc. What could be simpler?

One complication comes when we need to take into account both degree of relationship and the possibility of inbreeding. The diagram below shows the results of a parent-offspring mating.

The relationship of interest is the one outlined by the dotted rectangle, between a parent and an inbred offspring. Gene sharing between the two is shown in the following diagram, with unrelated segments removed:

Clearly parent and offspring share more than one-half of their genes in this case.

Measuring relationships - two different measures

(1) Wright's measure. The original measure, due to the population geneticist Sewall Wright (1889 - 1988), is denoted the 'Coefficient of Relationship'. In practice, it is similar to the measure of gene sharing used in this program. But it is based on correlation coefficients, using a rather complex method Wright developed called 'Path Coefficients'.

(2) Malécot's measure. This measure, due to the French mathematical geneticist Gustave Malécot (1911 - 1998), is denoted the 'Coefficient de Parenté'. Malécot's approach is a much simpler probability measure, and it has become the standard method of calculation.

The Coefficient de Parenté
This relates to alleles at a single locus. It is defined as follows for two individuals:
Choose an allele at random from Individual 1
Choose an allele at random from Individual 2
What is the probability that they are identical?

An example
Individual 1 is A1A2
Individual 2 is A1A3
The probability of identity (Coefficient de Parenté) is the probability of drawing the A1 twice, ie ½ x ½ = 0.25.

But these two individuals share half their genes. Wright's coefficient of relationship would be 0.5.

In general, Wright's coefficient is 2 x Malécot's coefficient. See next page for details.

Measuring relationships - a comparison of measures (Advanced topic)

Why use Malécot's measure?
The value of defining the relationship between individuals in this way is that it exactly mirrors the genetic process of producing offspring. Two randomly chosen parental genes combine to produce the offspring genotype. Defined in this way, the relationship between parents is exactly the probability of identity in the offspring, ie. the inbreeding coefficient of the offspring. The method allows for multi-generation calculations of relationship and inbreeding probabilities.

Why use Wright's measure?
This measure makes intuitive sense in defining relationships. Parent and offspring share half their genes. By contrast, Malécot's measure is one-quarter for this relationship. Even more confusingly, the measure of relationship of an individual with itself, or equivalently between identical twins, is one half. Intuitively, it ought to be one.

Can one just double the value to go from one measure to the other?
Almost. For the example above, where the two individuals are A1A2 and A1A3, this rule works, giving ¼ and ½ respectively. Similarly if the two individuals are both A1A2, the measures are ½ and 1.

But what if the genotypes are A1A1 and A1A2? Malécot's measure gives the value ½. In cases such as this involving inbred individuals Malécot's measure works well, even when it is difficult to describe the fraction of genes shared.
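The three examples on this page can be checked by enumerating the four equally likely ways of drawing one allele from each individual. The sketch below treats alleles with the same label as identical; the function name and the representation of a genotype as a pair of labels are assumptions.

```python
from fractions import Fraction
from itertools import product

def coefficient_de_parente(genotype1, genotype2):
    """Probability that one allele drawn at random from each individual is identical.

    Each genotype is a pair of allele labels, e.g. ("A1", "A2").
    """
    matches = sum(1 for a, b in product(genotype1, genotype2) if a == b)
    return Fraction(matches, 4)   # four equally likely draws

print(coefficient_de_parente(("A1", "A2"), ("A1", "A3")))  # 1/4
print(coefficient_de_parente(("A1", "A2"), ("A1", "A2")))  # 1/2
print(coefficient_de_parente(("A1", "A1"), ("A1", "A2")))  # 1/2
```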

Language confusion
How does one translate Malécot's Coefficient de Parenté into English? Alternative translations seem to be 'Coefficient of Relationship' or 'Coefficient of Kinship'. The former translation would directly conflict with Wright's original definition. A Wikipedia page comparing these measures uses Coefficient of Kinship to avoid this problem. There are some problems with this page, but it does contain a list of expectations for many different relationships.

Drawing of pedigrees

Pedigrees are usually drawn in the form: This method uses horizontal lines to emphasise the mating.

The alternative form, as used in GeneShare, makes it easier to follow the passage of genes from parent to offspring.

The Parent-Offspring pedigree can readily be drawn in the alternative form. Such pedigrees with overlapping generation structure cannot be drawn using just horizontal lines to connect parents.

The use of four colours (1) (Advanced topic)

It is convenient to use four distinguishable colours in the GeneShare pedigrees. This use is, however, misleading if it leads to the impression that colour has anything to do with the function of the gene. In the diagram below, gene regions labelled A are of the same colour, and should be similar, while gene regions labelled B should be different. The reality, of course, is the opposite. Genes of the two regions labelled as A have nothing to do with each other, while genes of B are just 'allelic variants' of each other.

One improvement from this point of view would be to distinguish the two chromosomes. Doing this would require 8 colours for the two parents.

But this would not solve the problem of distinguishing regions within chromosomes

The use of four colours (2)

Here's an attempt to differentially colour different regions for the two parents.
It's not easy to distinguish anything. When non-identical regions are blanked out for parent and offspring, however, the result shows almost the same information as when four colours are used.

In conclusion, the use of four colours is simpler and more informative, with the proviso that colour signifies nothing about chromosomal region.

The special case of siblings (Advanced topic)

The comparison of siblings in the simulations focuses on the identity of single DNA stretches. The diagram below shows an additional comparison, DNA stretches where both stretches are identical.

Note that when parent and offspring are compared, it is not possible for both strands to be identical by descent. Sibs share half their genes, as do parent and offspring, so the comparisons should be similar. However if, for example, the region in question contains a recessive gene, both sibs can carry the same recessive phenotype, whereas, in general, parent and offspring do not.

If you study quantitative genetics, you will already know something similar. The correlation between sibs contains a component from the dominance variance, a component that is absent from the parent-offspring correlation.

What is a gene? (Advanced topic)

The term "gene" has been used extensively in this program. Everyone knows what this means??

There is, unfortunately, some debate about the use of the term. A recent article, The Evolving Definition of the Term "Gene", summarises the difficulty of knowing what the term means now. Quoting from the article:
'......genes are not autonomous, independent agents ...... Rather, they exert their effects within...... "genetic regulatory networks" (GRNs).' So it seems that "genes" are out, and "GRNs" are in.

The traditionally used term "locus", a location on the chromosome, is probably still acceptable. So is the term "allele", an alternative (gene?) at a locus. But "allele" also has its limitations when discussing "alleles" at more than one locus. An allele at a first locus and an allele at a second locus are not, strictly speaking, alleles.

In summary, terms such as "gene sharing" might not be acceptable if submitted to a research journal. Hopefully they are understood in the genetic simulations here.