Manipulating Genomes
Part of Module 6: Genetics, evolution and ecosystems.
The ability to read, copy and edit DNA sequences has transformed both medicine and biology. This topic covers the key techniques: sequencing (reading the order of bases), amplification (copying a specific sequence), electrophoresis (separating by size), and genetic engineering (inserting sequences from one organism into another). Each technique has direct medical and commercial applications, but also raises ethical questions that are part of the specification.
Learning Objectives
| ID | Official specification wording | Main teaching sections |
|---|---|---|
| 6.1.3-lo-1 | (a) the principles of DNA sequencing and new DNA sequencing techniques (b) (i) how gene sequencing has allowed for genome-wide comparisons between individuals and between species (ii) how gene sequencing has allowed for the sequences of amino acids in polypeptides to be predicted (iii) how gene sequencing has allowed for the development of synthetic biology (c) the principles of DNA profiling and its uses (d) the principles of the polymerase chain reaction (PCR) and its application in DNA analysis | DNA Sequencing, DNA Profiling |
| 6.1.3-lo-2 | (e) the principles and uses of electrophoresis for separating nucleic acid fragments or proteins (f) (i) the principles of genetic engineering (ii) the techniques used in genetic engineering | Genetic Engineering |
| 6.1.3-lo-3 | (g) the ethical issues (both positive and negative) relating to the genetic manipulation of animals (including humans), plants and microorganisms (h) the principles of, and potential for, gene therapy in medicine. | Gene Therapy |
DNA Sequencing
The Sanger Chain-Termination Method
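The chain-termination (dideoxy) method, developed by Frederick Sanger, was the first widely used DNA sequencing technique. It is still used to sequence short stretches of DNA, and its core idea of halting synthesis at a labelled base carries over into several newer methods.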
Principle: Modified nucleotides (dideoxynucleotides, ddNTPs) lack the 3'-OH group needed to add the next nucleotide. When a ddNTP is incorporated into a growing chain, extension stops. By using a mixture of normal nucleotides and a small proportion of fluorescently labelled ddNTPs, a population of fragments of every possible length is generated.
Procedure:
1. The DNA sample to be sequenced is denatured into single strands
2. A reaction mixture is prepared containing:
    - Single-stranded template DNA
    - Primer (complementary to the template sequence)
    - Four standard dNTPs
    - DNA polymerase
    - A small proportion of each of the four fluorescently labelled ddNTPs (each base labelled with a different colour)
3. DNA polymerase extends the primer; when a ddNTP is incorporated, extension stops, generating a fragment of a specific length
4. Many reactions occur simultaneously, producing fragments of every possible length (terminating at every possible base)
5. Fragments are separated by high-resolution capillary gel electrophoresis; shorter fragments migrate faster, so fragments that differ by a single base are resolved
6. A laser detects the fluorescent label on the terminal ddNTP of each fragment
7. The sequence is read from the electropherogram (the colour and order of fluorescent peaks)
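The logic of the procedure can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the template is written 3'→5' so that its complement reads 5'→3', the primer is ignored, and the function names, the 5% ddNTP chance, and the number of reactions are all hypothetical choices made for illustration.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def chain_terminated_fragments(template, ddntp_fraction=0.05, n_reactions=5000, seed=0):
    """Simulate many extension reactions along one template.

    At each added base there is a small chance that a labelled ddNTP is
    incorporated, which stops extension. Each terminated fragment is
    recorded as (length, terminal base).
    """
    rng = random.Random(seed)
    # New strand in synthesis order, complementary to the template.
    new_strand = "".join(COMPLEMENT[base] for base in template)
    fragments = set()
    for _ in range(n_reactions):
        for position, base in enumerate(new_strand, start=1):
            if rng.random() < ddntp_fraction:
                fragments.add((position, base))
                break  # chain terminated; strands that never terminate are not recorded
    return fragments

def read_sequence(fragments):
    """Order fragments by length (as electrophoresis does) and read each terminal label."""
    return "".join(base for _, base in sorted(fragments))

print(read_sequence(chain_terminated_fragments("TACGGATCCATG")))
```

Because a fragment ends at every position, reading the terminal labels from shortest to longest fragment gives the sequence of the newly synthesised strand.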
High-Throughput (Next-Generation) Sequencing
Modern sequencing has evolved from Sanger's single-reaction method to massively parallel systems that sequence millions of fragments simultaneously. Key features:
- DNA is fragmented into short pieces (~100–300 bp)
- All fragments are sequenced at once (parallelisation)
- Computational assembly reconstructs the whole genome from overlapping fragments
- Whole human genomes can now be sequenced in hours rather than years
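To make "computational assembly from overlapping fragments" concrete, here is a toy greedy assembler. It is a sketch only, assuming error-free reads from one strand and no repeated regions; real assemblers use far more sophisticated methods, and the function names and `min_len` threshold are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b` (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left; remaining reads stay as separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads

print(greedy_assemble(["CGTTAGC", "ATGGCGTA", "CGTACGTT"]))
```

The three short reads overlap by four bases each, so they merge into a single contig.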
Applications of DNA Sequencing
| Application | Description |
|---|---|
| Evolutionary comparisons | Comparing gene or genome sequences between species reveals evolutionary relationships; molecular phylogenetics; the more similar the sequence, the more closely related the species |
| Medical diagnosis | Sequencing to identify disease-causing mutations in individual patients |
| Personalised medicine | Genome-wide sequencing identifies variants that influence drug metabolism or disease susceptibility; therapies can be tailored to an individual's genome |
| Prediction of protein sequence | The amino acid sequence of a protein can be inferred from its coding DNA sequence |
| Synthetic biology | Designed DNA sequences can be synthesised chemically and inserted into organisms to produce novel functions |
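Predicting a protein sequence from DNA, as listed in the table, amounts to reading the coding strand three bases at a time. The sketch below assumes the standard genetic code, a coding strand written 5'→3' containing only A, C, G and T, and no introns; `translate` is a hypothetical helper name.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order ("*" marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(coding_dna):
    """Translate from the first ATG until a stop codon or the end of the sequence."""
    seq = coding_dna.upper()
    start = seq.find("ATG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("CCATGAAATTTTAG"))
```

The result uses one-letter amino acid codes (M for methionine, K for lysine, F for phenylalanine).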
Genomics, Bioinformatics and Proteomics
Advances in sequencing have produced enormous datasets requiring computational analysis:
- Genomics: the study of whole genomes using DNA sequencing and computational biology. The Human Genome Project (completed 2003) mapped the entire human genome and made the data publicly available.
- Bioinformatics: the development of software, computing tools and mathematical models to store, retrieve and analyse biological data (nucleotide sequences, protein sequences, gene expression data).
- Computational biology: uses bioinformatics tools and biological data to model biological systems.
- Proteomics: the study of the complete set of proteins expressed by a genome (the proteome). The proteome is more complex than the genome because of alternative splicing and post-translational modification: one gene can give rise to multiple protein variants.
- DNA barcoding: comparing a short standardised DNA sequence from an unknown organism to a reference database to identify the species or establish evolutionary relationships.
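The comparison step in DNA barcoding can be sketched as a percent-identity search. This is a simplified illustration that assumes the sequences are already aligned and of equal length; the species names and barcode sequences are made up, and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Percentage of matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100 * matches / len(a)

def identify(unknown, reference_db):
    """Return (name, identity) for the reference barcode most similar to `unknown`."""
    scored = ((name, percent_identity(unknown, seq)) for name, seq in reference_db.items())
    return max(scored, key=lambda pair: pair[1])

# Made-up reference barcodes, for illustration only.
reference_db = {"species_A": "ACGTACGTAC", "species_B": "ACGTTCGTAA"}
print(identify("ACGTACGTAG", reference_db))
```

The same similarity score underlies evolutionary comparisons: the higher the identity between two sequences, the more closely related the organisms are likely to be.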
Sequencing pathogen genomes is also valuable for: identifying sources and transmission routes of disease outbreaks; detecting antibiotic-resistant strains; developing new vaccines and drug targets.
DNA Profiling
DNA profiling (also called DNA fingerprinting or genetic fingerprinting) is used to identify individuals or determine genetic relationships. It exploits regions of non-coding, repetitive DNA that vary widely between individuals:
- Variable number tandem repeats (VNTRs) (also called minisatellites): repeated DNA sequences where the number of repeats varies between individuals. They are heritable, located across the genome, and have no protein-coding function. With the exception of identical twins, no two individuals share identical VNTR patterns.
- Short tandem repeats (STRs) (also called microsatellites): shorter repeated sequences (2–6 bp) that are used in modern profiling because they can be amplified accurately by PCR from small or degraded samples.
A high similarity in VNTR or STR patterns between two individuals indicates they are likely closely related. The probability of two unrelated individuals sharing the same profile across 10–13 STR loci is approximately 1 in 10¹³.
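A figure of this order can be reached with the product rule, which multiplies the frequency of the observed genotype at each locus. The sketch below assumes the loci are inherited independently and uses a purely hypothetical genotype frequency of 0.1 at each of 13 loci; real casework uses measured population frequencies.

```python
import math

def random_match_probability(genotype_frequencies):
    """Product rule: multiply per-locus genotype frequencies (assumes independent loci)."""
    return math.prod(genotype_frequencies)

# Hypothetical illustration: 13 loci, each genotype shared by 1 person in 10.
illustrative_frequencies = [0.1] * 13
probability = random_match_probability(illustrative_frequencies)
print(f"random match probability is roughly {probability:.1e}")
```

Under these assumed values the probability is about 10⁻¹³, which shows why adding more loci makes a chance match between unrelated people so unlikely.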
Polymerase Chain Reaction (PCR)
PCR amplifies a specific target DNA sequence exponentially. It is essential before profiling because biological samples (blood, hair, saliva) often contain only tiny amounts of DNA.
Components of a PCR reaction:
- Template DNA (the sample to be amplified)
- Primers (short, single-stranded oligonucleotides complementary to either end of the target sequence)
- Free deoxynucleotide triphosphates (dNTPs)
- Heat-stable DNA polymerase (e.g. Taq polymerase, from Thermus aquaticus)
- Buffer solution
PCR cycle (repeated ~30 times):
| Step | Temperature | What happens |
|---|---|---|
| Denaturation | ~95 °C | Hydrogen bonds between base pairs break; double-stranded DNA → two single strands |
| Annealing | 50–65 °C (primer-dependent) | Primers bind (anneal) to their complementary sequences at each end of the target sequence |
| Extension | ~72 °C | Taq polymerase adds nucleotides to the 3' end of each primer, synthesising new complementary strands |
After 30 cycles: 2³⁰ ≈ 10⁹ copies of the original sequence.
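The doubling arithmetic can be checked directly. The sketch below assumes a single starting molecule and perfect doubling by default; the `efficiency` parameter is a hypothetical addition to show how less-than-perfect cycles reduce the yield.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies after `cycles` rounds; each cycle multiplies the count by (1 + efficiency)."""
    return initial_copies * (1 + efficiency) ** cycles

print(pcr_copies(1, 30))        # perfect doubling every cycle
print(pcr_copies(1, 30, 0.9))   # assumed 90% efficiency per cycle
```

With perfect doubling, one molecule gives 2³⁰ = 1,073,741,824 copies, which is the ≈ 10⁹ quoted above.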
Gel Electrophoresis
Gel electrophoresis separates DNA (or protein) fragments by size using an electric current.
Procedure:
1. Agarose gel is prepared (a porous matrix that acts as a molecular sieve)
2. DNA samples are loaded into wells at one end of the gel
3. An electric current is applied; DNA fragments are negatively charged (due to phosphate groups) and migrate towards the positive electrode
4. Smaller fragments move faster and further; larger fragments move slower and not as far
5. Fragments are visualised using ethidium bromide (intercalates into DNA; fluoresces orange under UV light), SYBR Green, or radioactive labels
6. A DNA ladder (marker with known fragment sizes) is run in a separate lane for size comparison
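The size comparison in the final step can be sketched numerically. This example assumes the commonly used approximation that migration distance is roughly linear in log10 of fragment size, which is not stated in the notes above; the ladder values and the `estimate_size` helper are hypothetical.

```python
import math

def estimate_size(distance, ladder):
    """Estimate fragment size (bp) from its migration distance.

    `ladder` is a list of (distance_mm, size_bp) pairs from the marker lane.
    The estimate interpolates log10(size) linearly between the two nearest bands.
    """
    points = sorted(ladder)  # order ladder bands by distance travelled
    for (d1, s1), (d2, s2) in zip(points, points[1:]):
        if d1 <= distance <= d2:
            t = (distance - d1) / (d2 - d1)
            log_size = math.log10(s1) + t * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("distance lies outside the ladder range")

# Made-up ladder: the 1000 bp band travelled 10 mm, the 100 bp band 20 mm.
ladder = [(10, 1000), (20, 100)]
print(round(estimate_size(15, ladder)))
```

A band halfway between the two markers is estimated at about 316 bp rather than 550 bp, because the relationship is logarithmic rather than linear.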
DNA profiling using STRs compares the pattern of bands produced after PCR amplification of multiple STR loci.
Uses of DNA profiling:
- Forensic identification (match crime-scene DNA to a suspect or eliminate suspects)
- Paternity testing
- Identification of remains
- Relationship testing in wildlife conservation
- Assessing genetic diversity within populations (e.g. for conservation breeding programmes)
Limitations of DNA profiling:
- Environmental contamination or sample degradation may compromise results
- Close genetic relatives (siblings, parents) can have similar profiles
- Match probability calculations assume population independence; this may not hold in small communities
- A matching profile does not by itself prove presence at a crime scene; other evidence is required
Genetic Engineering
Genetic engineering is the direct manipulation of an organism's genome by inserting, deleting or modifying specific DNA sequences. The key application is the production of transgenic organisms — organisms carrying a gene from a different species.
Key Tools
Restriction endonucleases (restriction enzymes):
- Cut DNA at specific recognition sequences (palindromic sequences of 4–8 base pairs)
- Different enzymes recognise different sequences (e.g. EcoRI recognises GAATTC)
- Most produce sticky ends (short, single-stranded overhangs), which are complementary to overhangs produced by the same enzyme in another piece of DNA
- Some cut bluntly
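Two of these ideas, palindromic recognition sequences and site-specific cutting, can be shown in a few lines. The sketch models only the top strand of a linear molecule and uses the fact that EcoRI cuts between the G and the first A of GAATTC; the function names and the test sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def is_palindromic(site):
    """A site is palindromic if it reads the same 5'->3' on both strands."""
    return site == reverse_complement(site)

def digest(seq, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every `site`, `cut_offset` bases into the site (top strand only)."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

print(is_palindromic("GAATTC"))
print(digest("AAGAATTCTTGAATTCGG"))
```

Each internal fragment begins with AATTC on the top strand. On the real double-stranded molecule the staggered cut leaves a single-stranded AATT overhang, which is the sticky end.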
DNA ligase:
- Forms phosphodiester bonds between adjacent DNA fragments, sealing the sugar-phosphate backbone
- Used to join the insert (gene of interest) to the vector
Vectors:
- Vehicles that carry the DNA insert into the host cell
- Most common vector: plasmid (small circular DNA molecule found naturally in bacteria)
- Other vectors: bacteriophages (viruses that infect bacteria), yeast artificial chromosomes (YACs), liposomes (lipid vesicles, used for gene therapy in humans)
Procedure for Producing a Recombinant Plasmid
- Isolate the gene of interest from donor DNA using restriction enzymes (or make a cDNA copy of the gene from its mRNA using reverse transcriptase)
- Cut the vector (plasmid) with the same restriction enzyme — both gene and plasmid now have complementary sticky ends
- Mix the gene and plasmid under conditions that allow complementary base pairing between sticky ends
- DNA ligase seals the phosphodiester bonds, creating a recombinant plasmid
- Introduce the recombinant plasmid into a host bacterium (e.g. E. coli) using electroporation — brief high-voltage pulses that temporarily increase membrane permeability — or calcium ion treatment and heat shock
- Bacteria that have successfully taken up the recombinant plasmid are identified using marker genes
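The cut-and-ligate steps above can be sketched as string operations. This is a deliberately simplified model: only the top strand is shown, the circular plasmid is written as a linear string, the plasmid is assumed to have a single EcoRI site, and the sequences and function names are made up for illustration.

```python
SITE = "GAATTC"  # EcoRI recognition sequence; the enzyme cuts after the first G

def excise_insert(donor):
    """Return the donor fragment released by cutting at the first two EcoRI sites."""
    first = donor.find(SITE)
    second = donor.find(SITE, first + 1)
    if first == -1 or second == -1:
        raise ValueError("donor needs two EcoRI sites flanking the gene")
    return donor[first + 1:second + 1]  # begins AATTC..., ends ...G

def make_recombinant(plasmid, donor):
    """Open the plasmid at its EcoRI site and join the insert between the cut ends."""
    cut = plasmid.find(SITE)
    if cut == -1:
        raise ValueError("plasmid has no EcoRI site")
    insert = excise_insert(donor)
    # Joining the pieces models DNA ligase sealing the sugar-phosphate backbone.
    return plasmid[:cut + 1] + insert + plasmid[cut + 1:]

recombinant = make_recombinant("CCGAATTCTT", "TTGAATTCATGAAAGAATTCAA")
print(recombinant)
```

Because both molecules were cut with the same enzyme, the EcoRI site is regenerated on each side of the insert in the recombinant sequence.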
Identifying Transformed Bacteria Using Marker Genes
Antibiotic resistance markers:
- The plasmid carries a gene for antibiotic resistance (e.g. ampicillin resistance)
- Bacteria are grown on media containing that antibiotic
- Only bacteria that took up the plasmid survive (they are resistant)
- However, this only identifies bacteria that took up a plasmid, not necessarily a recombinant one (with the insert)
Insertional inactivation:
- The gene of interest is inserted into the middle of a second marker gene (e.g. a lacZ gene producing β-galactosidase, which turns a substrate blue)
- Bacteria with non-recombinant plasmid (no insert): marker intact → colonies turn blue
- Bacteria with recombinant plasmid (insert disrupts marker): marker non-functional → colonies remain white
- White colonies contain the recombinant plasmid with the gene of interest
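Combining the two markers gives a simple decision rule for each bacterium plated on medium containing the antibiotic and the colour substrate. The sketch below just encodes that rule; `colony_outcome` is a hypothetical helper, not part of any laboratory software.

```python
def colony_outcome(has_plasmid, has_insert):
    """Predict what is seen on medium containing the antibiotic and the colour substrate."""
    if not has_plasmid:
        return "no growth"     # no resistance gene, so the antibiotic kills the cell
    if has_insert:
        return "white colony"  # insert disrupts the second marker gene
    return "blue colony"       # second marker gene intact, substrate turns blue

for has_plasmid, has_insert in [(False, False), (True, False), (True, True)]:
    print(has_plasmid, has_insert, colony_outcome(has_plasmid, has_insert))
```

Only the white colonies are picked for further work, since they both survived the antibiotic and carry the disrupted marker.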
Applications
- Human insulin production: insulin gene inserted into E. coli plasmid; bacteria produce human insulin for diabetes treatment
- Human growth hormone: produced in engineered bacteria
- Chymosin (used in cheese production): produced from engineered fungi; replaces animal rennet
- Herbicide-resistant crops (e.g. glyphosate-resistant soya): herbicide-resistance gene inserted into crop genome; herbicide sprayed on the field then kills weeds without harming the crop
- Insect-resistant crops (Bt crops): Cry toxin genes from Bacillus thuringiensis inserted into crop genome; crops produce their own insecticide
Ethical Considerations
Arguments for genetic engineering of crops and microorganisms:
- Increased crop yields; reduced pesticide use
- Production of medicines (insulin, vaccines) at low cost
- Potential to alleviate nutritional deficiencies (e.g. Golden Rice with β-carotene)
Arguments against:
- Potential environmental effects (gene flow to wild relatives; effects on non-target organisms)
- Reduced genetic diversity in crops (monocultures at risk)
- Commercial control of seed supply; impact on small-scale farmers in developing countries
- Concerns about long-term safety of consuming GM foods (contested; no harmful effects confirmed to date)
- Ethical objection to crossing species boundaries
Gene Therapy
Gene therapy is the insertion of a normal (functional) allele into cells of a person with a genetic disorder caused by a faulty allele. There are two approaches:
| Type | Target cells | Effect passed to offspring? | Notes |
|---|---|---|---|
| Somatic gene therapy | Body cells (e.g. lung epithelial cells in cystic fibrosis) | No | Often temporary: treated cells are eventually lost and replaced by untreated cells, so treatment must be repeated |
| Germline gene therapy | Fertilised egg or early embryo | Yes (heritable change) | Potentially permanent, because every cell of the resulting individual carries the functional allele; raises major ethical concerns; not currently permitted in most countries |
Vectors for gene therapy:
- Retroviruses: integrate the therapeutic gene into the host cell chromosome (permanent expression); risk of insertional mutagenesis (integration near a proto-oncogene, or within a tumour suppressor gene, could lead to cancer)
- Adenoviruses: infect many cell types; do not integrate (transient expression); may trigger immune response
- Liposomes: lipid vesicles that fuse with the cell membrane; low efficiency; little or no immune response; do not integrate
Example: cystic fibrosis
- A functional copy of the CFTR gene (coding for the cystic fibrosis transmembrane conductance regulator) is inserted into a liposome or adenovirus vector
- The vector is inhaled as an aerosol
- The gene is expressed in lung epithelial cells, producing functional CFTR protein
- Gene therapy for cystic fibrosis is still experimental; clinical success has been limited
Key Terms
- DNA sequencing: determination of the order of bases in a DNA molecule.
- Dideoxynucleotide (ddNTP): chain-terminating nucleotide used in Sanger sequencing.
- Polymerase chain reaction (PCR): technique that amplifies a specific DNA sequence by repeated heating and cooling cycles.
- Gel electrophoresis: method that separates DNA fragments by size as they move through a gel in an electric field.
- DNA profiling: identification of individuals by comparing patterns in their DNA.
- Recombinant DNA: DNA molecule formed by joining DNA from different sources.
- Restriction endonuclease: enzyme that cuts DNA at a specific recognition sequence.
- DNA ligase: enzyme that joins together DNA fragments by forming phosphodiester bonds.
- Plasmid vector: small circular DNA molecule used to carry foreign DNA into a host cell.
- Marker gene: gene used to identify cells that have taken up recombinant DNA.
- Genetic engineering: deliberate modification of an organism’s DNA using biotechnology.
- VNTR (variable number tandem repeat): non-coding repetitive DNA sequence where the number of repeats varies between individuals; used in DNA profiling.
- Genomics: the study of whole genomes using DNA sequencing and computational tools.
- Bioinformatics: the use of computational tools and databases to analyse biological sequence data.
- Proteomics: the study of the complete set of proteins expressed by a genome.
- cDNA (complementary DNA): a DNA copy made from an mRNA template using reverse transcriptase.
- Reverse transcriptase: enzyme that synthesises DNA from an RNA template.
- Gene therapy: treatment that introduces a functional allele into cells to correct a genetic disorder.
Connected Pages
- 6.1.1 Cellular control (gene mutations; the genes being engineered)
- 6.1.2 Patterns of inheritance
- 6.2.1 Cloning and biotechnology (use of microorganisms in biotechnology)
- 5.1.4 Hormonal communication (insulin production by GM bacteria)
- 2.1.3 Nucleotides and nucleic acids (DNA structure and replication)
- Module 6: Genetics, evolution and ecosystems