What is the point of group classification in science? The point is predictive power. Inclusion in a group should give me usefully accurate information about an individual or item, information that would otherwise require an inconvenient amount of work to learn directly.
I can create a group called "coffee mugs". There are the traditional "coffee mug" characteristics: handle, cute art on the side, and being labeled "not dishwasher safe". The characteristic that I really care about is how well it insulates the beverage (i.e., keeps the heat in the coffee and out of my hand).
If the traditional "coffee mug" characteristics predicted good insulation, then inclusion in or exclusion from the "coffee mug" group would be a good tool for predicting the quality of an individual cup. Consider three cups from the Rugbyologist kitchen. Two are "coffee mug" members (handle, cute art, not dishwasher safe) and one is a non-member (no handle, no art, silent on the dishwasher issue). One coffee mug is a great insulator (Mrs. Rugbyologist's favorite). The second is only good as a hand warmer on cold days. The third cup, the non-member, is also a great insulator, making it useful for coffee and frosty, malted beverages.
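The three cups can be tallied in a few lines of Python. This is only a toy sketch of the kitchen example: the `Cup` class, the cup names, and the `accuracy` helper are made up for illustration, and three cups are far too few to support any statistical claim. It compares the rule "coffee mugs insulate well, other cups do not" against a baseline that ignores the group and always guesses "good insulator".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cup:
    name: str              # hypothetical label, for illustration only
    is_coffee_mug: bool    # handle, cute art, not dishwasher safe
    good_insulator: bool   # the characteristic we actually care about


# The three cups described in the kitchen example.
CUPS = [
    Cup("favorite mug", is_coffee_mug=True, good_insulator=True),
    Cup("hand-warmer mug", is_coffee_mug=True, good_insulator=False),
    Cup("plain cup", is_coffee_mug=False, good_insulator=True),
]


def accuracy(cups, predict):
    """Fraction of cups whose insulation quality the rule predicts correctly."""
    hits = sum(predict(cup) == cup.good_insulator for cup in cups)
    return hits / len(cups)


# Rule 1: group membership predicts good insulation.
group_rule_accuracy = accuracy(CUPS, lambda cup: cup.is_coffee_mug)

# Rule 2: ignore the group entirely and always guess "good insulator".
baseline_accuracy = accuracy(CUPS, lambda cup: True)

print(f"group rule: {group_rule_accuracy:.2f}")
print(f"baseline:   {baseline_accuracy:.2f}")
```

In this tiny sample the group rule gets one cup out of three right, while ignoring the group gets two out of three right.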
"Coffee mugs" is useless classification in the Rugbyologist kitchen, because it failed to predict the phenotype of interest.
When it comes to humans, we care about phenotypes like disease risk. A simple classification system that predicts disease risk would be extremely useful. For race to be that classification system, it needs to predict these phenotypes both accurately and efficiently.
We can use genetic variation to probabilistically identify the geographic ancestry of individuals at the resolution of continents or countries. (Geographic ancestry is not perfectly correlated with traditional race definitions, but some commentators use "race" as an equivalent term.) The broad patterns of human genetic variation, however, are consistent with "Out of Africa" migration and genetic drift (Jorde and Wooding [2004] and Tishkoff and Kidd [2004]). In a recent study, Novembre et al. (2008) found that the first two principal components (corresponding roughly to latitude and longitude) of their sub-population clustering explained only 0.45% of the genetic variation in the total population. Only a tiny fraction of the available genetic variation is distributed in a way that creates separate sub-groups. Most of the minimal variation distinguishing between sub-populations has little, if any, fitness effect.
Racial identification has not provided the predictive power we are seeking in the biomedical setting (Jorde and Wooding [2004] and Tishkoff and Kidd [2004]) and appears to have little power to do better in the future.
Perhaps race would produce better predictions if sub-populations could be defined with better resolution. Novembre et al. (2008) used approximately 500,000 genetic markers to cluster 3000+ Europeans by geographic ancestry. They concluded that their data do not indicate discrete populations. If more sequence markers were used, more discrete or smaller sub-populations would certainly be found. Even if those sub-populations are predictive, the sequence information required will be on the same scale as (or worse than) that required to assess an individual's complement of disease-related genetic variants.
Directly identifying the disease-related genetic variants, the effect size and penetrance of those variants, and the set of variants possessed by an individual is the path to biomedically useful predictions of individual disease risk (Jorde and Wooding [2004]).
Racial group classification is predictive of one thing: inclusion of the individual in that racial group.
Read more from the Rugbyologist here!




