MOLECULAR MARKERS AND GENOME SEQUENCING IN CROP IMPROVEMENT

Molecular techniques, in particular molecular markers, have been used to examine DNA sequence variation within and among crop species and to create new sources of genetic variation by introducing favorable traits from landraces and related crop species.

Molecular markers

What are they?
A marker is a gene or piece of DNA with an easily identified phenotype, such that cells or individuals carrying different alleles are distinguishable; for example, a gene with a known function or a single nucleotide change in DNA.
Or: a readily detectable sequence of DNA or protein whose inheritance can be monitored.

However, alternative methods, such as the construction of partial maps and the combination of pedigree and marker information, have also proved useful in identifying marker/trait associations. A revision of current crop improvement methods that brings molecular markers into breeding programs is therefore crucial in the present scenario.

Molecular markers and genome sequencing

• Map and sequence the genomes of model organisms
- The bacterium E. coli (4.6 Mb)
- The yeast S. cerevisiae (12 Mb)
- The roundworm C. elegans (100 Mb)
- The fruit fly D. melanogaster (180 Mb)
- The mouse M. musculus (3,000 Mb)

Bacteriophage φX174, the first genome to be sequenced, is a viral genome of only 5,386 base pairs (bp). Fred Sanger invented "shotgun" sequencing, a strategy in which random pieces of DNA from the genome are isolated, cloned and sequenced individually. The sequenced fragments are assembled in silico by their overlapping regions to form contiguous sequences (otherwise known as contigs). The final step uses custom primers to close the gaps between contigs (‘walking’), giving the completely sequenced genome. Five years later, Sanger used "shotgun" sequencing to complete the sequence of bacteriophage λ, which was significantly larger (about 48 kbp).
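As a rough illustration of the in silico assembly step, the sketch below merges reads by their longest suffix-to-prefix overlaps until no overlaps remain. It is a toy greedy assembler written for these notes, not the procedure used in the original sequencing projects; the function names (overlap, greedy_assemble), the min_len threshold and the example reads are all assumptions.

```python
def overlap(a, b, min_len=3):
    """Return the length of the longest suffix of a that equals a prefix of b.

    Overlaps shorter than min_len (an assumed threshold) count as 0.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Greedily merge the pair of sequences with the largest overlap.

    Returns the list of contigs left when no pair overlaps by min_len or more.
    """
    # Drop exact duplicates (keeping order) and reads contained in other reads.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs
               if not any(r != s and r in s for s in contigs)]
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


if __name__ == "__main__":
    # Hypothetical reads sampled from a 25-bp toy sequence.
    reads = ["AGCCTAGGAT", "ATGGCGTACG", "GGATCCA", "TACGTTAGCC"]
    print(greedy_assemble(reads))
```

The sketch ignores sequencing errors, reads from the opposite strand and repeated sequences, all of which a real assembler has to handle; when reads do not overlap, several contigs are returned and the gaps between them remain to be closed.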
This method allowed sequencing projects to proceed at a much faster rate, expanding the scope of realistic sequencing ventures. Since then, other viral and organellar genomes have been sequenced using similar techniques.

Map Based Sequencing

Two alternative approaches were used to sequence the human genome. The BAC-to-BAC method, employed by the DOE- and NIH-funded Human Genome Project (HGP), is slow because it depends on mapping the genome to be sequenced and obtaining sets of partially ordered, overlapping BACs. Also referred to as the map-based method, it was developed from procedures used in individual labs in the late 1980s and 1990s.

BAC to BAC Sequencing

The BAC-to-BAC approach first creates a crude physical map of the whole genome before sequencing the DNA. Constructing the map requires cutting the chromosomes into large pieces and working out the order of these large pieces before taking a closer look and sequencing all the fragments.

VENTER’S SHOTGUN

Whole genome shotgun sequencing is a much faster approach, and it enabled researchers to speed up the timetable for sequencing enormously. The shotgun method was developed by J. Craig Venter and his associates in 1996, when he was at the Institute for Genomic Research (TIGR).

DNA based markers:

These are markers based on differences in the DNA profiles of individuals. DNA-based markers were developed to avoid the problems specific to morphological and biochemical markers. They are highly polymorphic, show simple inheritance (often co-dominant), occur abundantly throughout the genome, are easy and fast to detect, have minimal pleiotropic effects, and can be detected independently of the developmental stage of the organism. Numerous markers have been used to map different chromosomes in several crops, including rice, wheat, maize and soybean. These markers have been used in diversity analyses, parentage detection, DNA fingerprinting, and prediction of hybrid performance.
Molecular markers are useful in indirect selection processes, enabling manual selection of the individuals for further propagation. The introduction of the polymerase chain reaction (PCR) has enabled the development of powerful genetic markers. Most recent DNA-based markers are classified into three basic categories, depending on the technique used:

Hybridization based (Non-PCR) markers

Restriction Fragment Length Polymorphism (RFLP)

RFLP was the first DNA marker technology used. It was developed in the late 1970s, after the discovery of restriction endonucleases (REs) in bacteria. An RE acts as a molecular scissor that cuts DNA molecules at a specific sequence; for example, EcoRI recognizes the sequence GAATTC. A six-base site is expected about once every 4^6 = 4,096 bp in random sequence, so the genome of a pine tree (on the order of 20 billion bp) digested with EcoRI can generate about 5 million different restriction fragments. The first true RFLP map in a crop plant (tomato) was constructed in 1986 with 57 loci (Bernatzky and Tanksley, 1986).

Polymerase Chain Reaction (PCR) based markers:

Randomly Amplified Polymorphic DNA (RAPD):

The RAPD technique became popular in the 1990s. It is a PCR-based technique, but it differs from conventional PCR in that it needs only one primer for amplification. The primer is normally short (10 nucleotides) and therefore less specific. Primers can be designed without any prior sequence information for the organism being tested (Rafalski, 1997). More than 2000 different RAPD primers are available commercially. Genomic DNA normally has sequences complementary to a RAPD primer at many locations. If two of these locations are close to each other (<3000 bp) and the sequences are in opposite orientation, the region between them can be amplified. This amplified region is called a RAPD locus.
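The amplification rule just described (two primer sites in opposite orientation, less than about 3000 bp apart) can be sketched as a small in silico search. This is a simplified illustration under stated assumptions: it requires exact primer matches, whereas real RAPD primers also anneal at imperfect sites, and the names rapd_loci, find_all and max_product, as well as the example primer, are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def find_all(seq, motif):
    """Return the start positions of all (possibly overlapping) matches."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def rapd_loci(template, primer, max_product=3000):
    """List candidate RAPD loci as (start, end, product_size) tuples.

    A locus needs the primer sequence on the given strand and, downstream
    of it, the primer's reverse complement (the site where the same primer
    anneals in the opposite orientation), with a product no longer than
    max_product. Exact matching is an assumption of this sketch.
    """
    forward_sites = find_all(template, primer)
    reverse_sites = find_all(template, reverse_complement(primer))
    loci = []
    for i in forward_sites:
        for j in reverse_sites:
            size = j + len(primer) - i
            if j >= i + len(primer) and size <= max_product:
                loci.append((i, j + len(primer), size))
    return loci


if __name__ == "__main__":
    primer = "GATTACAGGC"  # hypothetical 10-mer
    template = "TT" + primer + "A" * 50 + reverse_complement(primer) + "TT"
    print(rapd_loci(template, primer))
```

Every pair of facing sites within range is reported, including nested pairs, so the output is an upper bound on the bands one might expect rather than a prediction of a gel pattern.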
Normally, a few (3-20) loci can be amplified by a single RAPD primer.

Microsatellites:

Microsatellites, or Simple Sequence Repeats (SSRs), are polymorphic loci present in nuclear and organellar DNA (mitochondrial and chloroplast DNA), consisting of repeating units of 1-6 base pairs in length (Turnpenny and Ellard, 2005). They are typically neutral and co-dominant, and are used as molecular markers with wide-ranging applications in the field of genetics. Microsatellite DNA is reported to be the only molecular marker that provides clues about which alleles are more closely related (Goldstein et al., 1995). SSRs are considered the markers of choice for self-pollinated crops with little intraspecific polymorphism (Roder et al., 1998).

What are microsatellites?

These markers are capable of detecting high levels of inter- and intra-specific polymorphism, particularly when the number of tandem repeats is one hundred or greater (Queller et al., 1993). The repeated sequence is often simple, consisting of two, three or four nucleotides (di-, tri-, and tetra-nucleotide repeats, respectively), and can be repeated 10 to 100 times. The majority of these sequences lie in non-coding regions (to be continued)