Mutation rates

FTDNA Marker Mutation Rates

Conrad W. Terrill, 4 Sept. 2010

A number of efforts have been made to characterize the mutation rates of FTDNA's sixty-seven yDNA markers, and they have not all produced consistent results. For our purposes in DOR-Terrill's YDNA Roger project, though, an accurate quantitative determination does not appear to be very important. Some of our Modal Roger1 group members have exactly the same yDNA marker values as Roger Terrill, who was born perhaps 350 years before them, while others have as many as three mutations in 67 markers. We have run into problems in interpreting the results, though, now that we have ten members who have been tested on 37 or more markers. The last two sets of results, for JHTurrell and WDTerrill, contradicted our previous conclusion that CDYb is a mutation indicator for the Thomas2 line. Before launching into an explanation for that, let's build up a better understanding of these 67 markers.

All of these markers are Short Tandem Repeats (STRs), meaning that a short sequence of DNA base pairs, such as AGT, is repeated some number of times. For our purposes the marker value can be considered to be the repeat count (it's more complex for some markers). You might notice that some of the markers appear to be related; e.g., 385a and 385b; 389-1 and 389-2; 464a, 464b, 464c and 464d.

One of these pairs, 389-1 and 389-2, is a special case. Marker 389-1 is the count for the first part of 389, and marker 389-2 is the count for the entirety of 389. So if 389-1 is bumped up by one via a mutation, 389-2 will be too. But an increment in 389-2 does not imply an increment in 389-1, since it could be the second part which was incremented.

In all the other cases (those markers ending in a, b, c or d), the situation is different. 385a and 385b, for instance, are two markers in different locations on the Y-chromosome, but the same short base pair sequence is repeated in both. The testing process is not able to distinguish which is which, so the two are listed in increasing order of their count values: the one with the lower value is designated 385a, and the higher 385b. Likewise for 459a and b, 464a through d, YCAIIa and b, CDYa and b, 395S1a and b, and 413a and b. None of this has created much of a problem for us yet, except with regard to CDYa and b, since we've seen no mutations in any of the other multi-copy markers. There's more to our problem with CDYa and b, though: these two markers mutate much more rapidly than any others.
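The two conventions described above (389-2 includes 389-1, and multi-copy markers are reported in increasing order) can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this sketch, and the marker values are hypothetical, not taken from any project member.

```python
# Illustrative sketch only. All marker values below are hypothetical.

def label_copies(marker, counts):
    """Assign a, b, c, d suffixes to a multi-copy marker's repeat counts.

    The test cannot tell which physical copy produced which count, so the
    counts are simply reported in increasing order.
    """
    suffixes = "abcd"
    if len(counts) > len(suffixes):
        raise ValueError("at most four copies are handled in this sketch")
    return {f"{marker}{suffixes[i]}": c for i, c in enumerate(sorted(counts))}


def second_part_of_389(dys389_1, dys389_2):
    """Repeat count of the second part of 389.

    389-1 counts the first part only; 389-2 counts the whole of 389.
    """
    if dys389_2 < dys389_1:
        raise ValueError("389-2 covers all of 389, so it cannot be below 389-1")
    return dys389_2 - dys389_1


if __name__ == "__main__":
    # The lower count is always labelled 'a', whichever copy it came from.
    print(label_copies("385", [14, 11]))   # {'385a': 11, '385b': 14}

    # A mutation in the first part raises both 389-1 and 389-2 ...
    print(second_part_of_389(13, 29))      # 16
    print(second_part_of_389(14, 30))      # 16 (second part unchanged)
    # ... while a mutation in the second part raises only 389-2.
    print(second_part_of_389(13, 30))      # 17
```

One consequence worth noting: when comparing two men, a one-step difference in both 389-1 and 389-2 is most simply read as a single mutation in the first part, not as two separate mutations.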

There are many subtleties to mutation rates. Some researchers and surname project administrators contend that mutation rates differ between paternal lines. Dennis Garvey pointed out in a RootsWeb Genealogy-DNA posting that Kayser et al. found that trinucleotide YSTRs (STRs with three bases in the repeat unit) behave differently from tetranucleotide YSTRs (with four bases in the repeat unit; the majority of YSTRs, by the way, are tetranucleotide): trinucleotide YSTRs mutated either much more slowly or much more rapidly. Kayser speculated that what mattered was the absolute length of the YSTR (the number of bases in the repeat unit times the number of repeats). Garvey added that marker 388 with a value of 13, for instance, could mutate much more slowly than the same marker with a value of 16. It's not necessary that we understand all the subtleties, but it's worth keeping in mind that things are not simple.¹
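The "absolute length" mentioned above is just a product, and a tiny sketch makes Garvey's 388 example concrete. It assumes that marker 388 is a trinucleotide repeat (three bases per repeat unit); that assumption is not stated in this article, so treat the numbers as illustrative.

```python
def absolute_length(bases_per_repeat, repeat_count):
    """Absolute length of an STR in bases: repeat-unit size times repeat count."""
    return bases_per_repeat * repeat_count


# Assumption for illustration: marker 388 has a three-base repeat unit.
BASES_PER_REPEAT_388 = 3

if __name__ == "__main__":
    for value in (13, 16):
        length = absolute_length(BASES_PER_REPEAT_388, value)
        print(f"388 = {value}: {length} bases")   # 39 bases, then 48 bases
```

Under that assumption, the step from 13 to 16 repeats lengthens the repeated stretch from 39 to 48 bases, which is the kind of difference Kayser's speculation ties to a change in mutation rate.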

The graph below shows the FTDNA marker mutation rates, as given in a table compiled by Leo Little.² What I've plotted are actually the inverses of the mutation rates, since I think they convey more meaning. (Would you rather deal with "0.001 mutations per generation," or "1000 generations per mutation"?) "1000 generations per mutation" means that, on average, in a paternal line extending many thousands of generations (if you can imagine such a thing), we would see one mutation every 1000 generations, give or take. Note that the number of generations per mutation is plotted on a logarithmic axis, since the values span several orders of magnitude. And keep in mind that the faster-mutating markers are those with lower "generations per mutation" values.
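For anyone who wants to reproduce the conversion, here is a minimal sketch of the inversion and of why a logarithmic axis suits it. The function names are invented for this example; the sample values (28, 98, 1000 and 10,000 generations per mutation) are ones quoted in this article.

```python
import math


def generations_per_mutation(rate_per_generation):
    """Invert a mutation rate (mutations per generation)."""
    if rate_per_generation <= 0:
        raise ValueError("mutation rate must be positive")
    return 1.0 / rate_per_generation


def log_axis_position(gens_per_mutation):
    """Position on a base-10 logarithmic axis."""
    return math.log10(gens_per_mutation)


if __name__ == "__main__":
    print(generations_per_mutation(0.001))  # about 1000 generations per mutation

    # On a log axis each factor of ten takes the same distance, so values
    # from 28 to 10,000 fit on one readable plot.
    for gens in (28, 98, 1000, 10000):
        print(f"{gens:>6} generations/mutation -> log10 = {log_axis_position(gens):.2f}")
```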


The main purpose of this graph is to illustrate just how much more rapidly CDYa and CDYb mutate, compared to all the other markers. The actual value for both is 28 generations per mutation, as given in Little's table. The next most rapidly mutating marker is 576, at 98 generations per mutation. Note that we will very likely never see a mutation in our Modal Roger1 group in those markers which mutate once every 10,000 or more generations. In fact, we have yet to see a mutation in a marker which mutates once every 1000 or more generations. Here's a list of those mutations which we've actually seen in our group, so far, along with the mutation rates given in Little's table:

Marker (DYS)	Mutations observed	Generations per mutation
389-2	1	413
447	1	379
448	1	741
GATA H4	1	481
456	1	136
607	1	243
576	1	98
CDY (a or b)	( 5 )	28
511	1	783
534	1	120
520	1	408
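The observed counts in the table can be set against a rough first-order expectation. The sketch below assumes about 100 father-to-son transmissions in total (ten tested members times roughly ten generations each, the figures used in this article) and treats every line as independent. That overcounts, since members share ancestors in the early generations, so read the output as a ballpark only.

```python
# Generations per mutation and observed counts, copied from the table above.
GENERATIONS_PER_MUTATION = {
    "389-2": 413, "447": 379, "448": 741, "GATA H4": 481, "456": 136,
    "607": 243, "576": 98, "CDY (a or b)": 28, "511": 783, "534": 120,
    "520": 408,
}
OBSERVED = {marker: 1 for marker in GENERATIONS_PER_MUTATION}
OBSERVED["CDY (a or b)"] = 5

# Assumption: 10 members x 10 generations, lines treated as independent.
TRANSMISSIONS = 10 * 10


def expected_mutations(gens_per_mutation, transmissions):
    """First-order expected number of mutations over a number of transmissions."""
    return transmissions / gens_per_mutation


if __name__ == "__main__":
    # Fastest-mutating markers first.
    for marker, gens in sorted(GENERATIONS_PER_MUTATION.items(), key=lambda kv: kv[1]):
        expected = expected_mutations(gens, TRANSMISSIONS)
        print(f"{marker:<13} expected {expected:5.2f}   observed {OBSERVED[marker]}")

    # A marker at 10,000 generations per mutation, for comparison.
    print(f"{'10,000-gen':<13} expected {expected_mutations(10000, TRANSMISSIONS):5.2f}")
```

On these assumptions a marker at 10,000 generations per mutation has an expectation of only 0.01 mutations across the whole group, which is why such markers are unlikely ever to show a change here.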

It should be clear now why we've seen so many CDY mutations. It's been ten generations since Roger1, on average, for those of us tested (counting RTerrill of England as one of us), so each of us has roughly a 10/28 chance of showing a CDYa or CDYb mutation. Across the ten of us that comes to about 3.6 expected, while 5 of us actually do show one. Considering the small sample, the agreement is about as good as one can expect.
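The 10/28 figure is a linear approximation, and it can be compared with a slightly more careful estimate. The sketch below is a rough check under stated assumptions, not a rigorous analysis: it uses a constant rate of one mutation per 28 generations, treats the ten members as independent lines (they are not, since they share ancestors), and ignores back mutations.

```python
from math import comb


def p_at_least_one(gens_per_mutation, generations):
    """Chance that a line sees at least one mutation in the given generations."""
    per_generation = 1.0 / gens_per_mutation
    return 1.0 - (1.0 - per_generation) ** generations


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    members, generations, gens_per_mutation = 10, 10, 28

    linear = generations / gens_per_mutation
    compound = p_at_least_one(gens_per_mutation, generations)
    print(f"linear estimate per member:   {linear:.3f}")
    print(f"compound estimate per member: {compound:.3f}")
    print(f"expected members (linear):    {members * linear:.1f}")
    print(f"expected members (compound):  {members * compound:.1f}")

    # How surprising is it to see 5 or more of 10 with a mutation?
    print(f"P(5 or more of 10), linear p: {binomial_tail(members, 5, linear):.2f}")
```

Either way, seeing 5 of 10 is not an unusual outcome for a sample this small, which is the point the paragraph above is making.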

We're no longer very certain that CDYb is an indicator mutation of the Thomas2 line. CWTerrill, JRTerrill and WDTerrill are all descendants of Thomas2, but only CWTerrill and JRTerrill show the mutation. It could turn out that the mutation occurred independently in each of their lines, or that it did in fact occur at Thomas2 and a back mutation then occurred in WDTerrill's line. That back mutation would have had to occur after Moses7, since he's the most recent common ancestor (MRCA) of 4th cousins WDTerrill and CWTerrill.

Henceforth YDNA Roger will treat CDYa and CDYb marker values differently from the others. They can still be useful within branches, to distinguish one part of a branch from another. But they are not as useful as the other markers as key branch point indicators in Roger1's descendant tree. It will be interesting to see the role played by the CDYs as more descendants get tested and the tree develops.

References

1. Dennis Garvey, post to RootsWeb Genealogy-DNA, 27 Oct. 2004.

2. Leo Little, table of mutation rates: http://freepages.genealogy.rootsweb.ancestry.com/~geneticgenealogy/ratestuff.htm