
#DNA SEQUENCES WITH A HIGH DEGREE OF POLYMORPHISM ARE CODE#

At the level of the DNA sequence, the degree of polymorphism in the population can be quantified in several ways. It is frequently characterized by the fraction of polymorphic genes, i.e. the proportion of genes that occur in at least two alleles in the studied population. With a sufficiently large sample of the studied individuals, we would probably find that practically all genes are polymorphic in this sense, so only those genes whose commonest allele occurs in at most, for example, 99% of the individuals in the population are generally considered to be polymorphic.

The heterozygosity index (H), which is basically the frequency of heterozygotes in the population, is another commonly employed measure of the degree of polymorphism in the population. For the individual genes, this index is usually calculated from the frequencies of the individual alleles:

$$H = 1 - \sum_{i} x_i^2$$

where $x_i$ is the frequency of the i-th allele in the population. Thus, a population containing a large number of alleles with the same frequency has the largest H value. The average heterozygosity index for the given population can be calculated as the arithmetic mean of the heterozygosity indices of the individual genes. If the heterozygosity index is calculated on the basis of sequence data, it is also sometimes called the gene diversity index.

The nucleotide (amino acid) diversity index (π) can also be calculated on the basis of sequence data; it corresponds to the average number of nucleotide (amino acid) differences between all the pairs of alleles in the sample, divided by the length of the sequences of the relevant alleles. The average number of pair differences Π can be calculated for the whole population or, to be more precise, for the population sample, as

$$\Pi = \frac{1}{n(n-1)/2} \sum_{i<j} \Pi_{ij}$$

where $n$ is the number of observed sequences (so that $n(n-1)/2$ is the number of different pairs of sequences) and $\Pi_{ij}$ is the number of differences between the i-th and j-th sequence.

As was mentioned at the beginning of the chapter, the results obtained by the methods of molecular biology indicate that the individual members of a particular species differ in the nucleotides present at a great many positions in their genes, and thus the gene pool of every species contains an enormous amount of genetic polymorphism. A considerable part of this polymorphism in the gene pool of a species, and in the gene pools of the individual populations, apparently exists because it is selectively neutral, and selectively neutral traits can persist in the population for a very long time. Most selectively neutral traits are eventually eliminated from the population by genetic drift or genetic draft; however, mutation processes generate new polymorphisms by that time.

Mutations in all parts of the genome that do not code for any protein, especially in pseudogenes, i.e. inactivated and thus nonfunctional copies of genes, are usually considered to be selectively neutral. The category of neutral mutations apparently also includes a large part of the mutations in introns and also part of the synonymous mutations, i.e. nucleotide substitutions, for example in the third positions of a large percentage of nucleotide triplets, that, because of the degeneracy of the genetic code, leave the encoded amino acid unchanged.
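The heterozygosity index and the fraction of polymorphic genes described in this section can be illustrated with a short calculation. The following is a minimal sketch, not a reference implementation: it assumes the usual form H = 1 − Σ x_i² for the index, takes the 99% criterion for polymorphism from the text, and uses hypothetical function names and an invented sample (`sample`, `geneA`, `geneB`, `geneC`) purely for illustration.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Return the frequency of each allele in a list of observed allele labels."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def heterozygosity(frequencies):
    """Heterozygosity index H = 1 - sum(x_i^2) over allele frequencies x_i."""
    return 1.0 - sum(x * x for x in frequencies)


def is_polymorphic(frequencies, threshold=0.99):
    """A gene counts as polymorphic if its commonest allele is at most `threshold`."""
    return max(frequencies) <= threshold


def fraction_polymorphic(genes, threshold=0.99):
    """Fraction of genes in `genes` (name -> allele list) that are polymorphic."""
    flags = [
        is_polymorphic(allele_frequencies(alleles).values(), threshold)
        for alleles in genes.values()
    ]
    return sum(flags) / len(flags)


def average_heterozygosity(genes):
    """Arithmetic mean of the per-gene heterozygosity indices."""
    values = [
        heterozygosity(allele_frequencies(alleles).values())
        for alleles in genes.values()
    ]
    return sum(values) / len(values)


# Hypothetical sample: allele labels observed for three genes (invented data).
sample = {
    "geneA": ["a1"] * 50 + ["a2"] * 50,                             # two equally common alleles
    "geneB": ["b1"] * 100,                                          # monomorphic
    "geneC": ["c1"] * 25 + ["c2"] * 25 + ["c3"] * 25 + ["c4"] * 25,  # four equally common alleles
}

per_gene_h = {
    name: heterozygosity(allele_frequencies(alleles).values())
    for name, alleles in sample.items()
}
```

In this invented sample `per_gene_h` is 0.5 for `geneA`, 0.0 for `geneB` and 0.75 for `geneC`, which matches the statement that more alleles of equal frequency give a larger H. Two of the three genes satisfy the 99% criterion.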

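The average number of pair differences Π and the nucleotide diversity index can be sketched in the same way. The example below assumes that the sequences are already aligned and of equal length, counts a difference wherever two sequences carry different characters at the same position, and divides Π by the sequence length to obtain the diversity index. The function names and the three short sequences are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def pair_differences(seq_a, seq_b):
    """Number of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def mean_pair_differences(sequences):
    """Average number of differences over all n(n-1)/2 pairs of sequences."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    total = sum(pair_differences(a, b) for a, b in combinations(sequences, 2))
    return total / (n * (n - 1) / 2)


def nucleotide_diversity(sequences):
    """Mean pair differences divided by the (common) sequence length."""
    return mean_pair_differences(sequences) / len(sequences[0])


# Hypothetical aligned sample of three short sequences (invented data).
aligned = ["ACGTACGT", "ACGTACGA", "ACCTACGA"]

big_pi = mean_pair_differences(aligned)      # pairs differ at 1, 2 and 1 positions
diversity = nucleotide_diversity(aligned)    # big_pi divided by length 8
```

Here the three pairs differ at 1, 2 and 1 positions, so Π is 4/3 and the diversity index is (4/3)/8 = 1/6. For large samples the loop over all pairs grows quadratically with n, which is acceptable for a sketch but may call for a per-site calculation in practice.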