BIOL 322 Lab 3. Genetic Drift

When you have completed the exercise below, submit your model and your lab write-up via the Google Drive Form (https://forms.gle/ E3QLACX1aA2Q6ist5). Responses due by the following week's lab period.

During lab, we will work in groups. Here are the groups assigned for today's exercise:

Martha and DianeRevanth and Vaughn
Emily F. and IsabelRicarda and My (both absent)
Andrea and KalyanAidan and Sophie
Merry and EliseAmelia and Emily S.
Calvin and ShainaHunter and Julia

Overview

Today, we'll create a modified version of our Hardy-Weinberg model from last week and we'll add genetic drift. Think about how we describe the effects of genetic drift on genetic variation. Starting with our base HW model, which parameters should we adjust and what things do we need to add to incorporate genetic drift?

This is somewhat complicated, since drift is a stochastic process. Remember from Chapter 6, we know the variance in allele frequency change due to drift; we can use that to add drift to our gene pool. "Drift" will be a change to the allele frequency, where the value will be a random number chosen from a normal distribution with a mean of zero and the binomial standard deviation. In this way, we add (or subtract) a small, random number to the allele frequency each generation. We'll make this change and then use the adjusted allele frequency (padj) to randomly combine alleles to form zygotes.

Note that this should mean our population will always remain in HW proportions, even though allele frequencies will be changing each generation! This is a good demonstration of the fact that nonrandom mating is the only thing that puts a population out of HW proportions. For other evolutionary forces, we must examine change in allele frequency across generations to detect the effects.

Building the Drift Model

To begin, we'll set up a model similar to the HW one, except we'll only keep track of 1 allele (which makes it easier to add the stochasticity to the model without messing things up!). Here's a screenshot of the final product, for reference as we proceed:

Image of the completed drift model

  1. Open a new model and input the following settings:
  2. Units for Time Year
    INIITIAL TIME = 0
    FINAL TIME = 1000
    Time Step 1

  3. Create a stock for each genotype, perhaps named: genoA1A1, genoA1A2, genoA2A2. Like in the image, be sure to leave space for the additional variables we'll be adding!

  4. Next, add the flows, one per genotype stock. Be sure that the arrow on the flow connects to the stock. We name these delta11, deltaA1A1, deltaA1A2, deltaA2A2.

  5. Now we have a number of additional variables to add:
  6. Variable Description
    N population size, which we'll want to manipulate with Simulation Control
    p0 this is the starting frequency of the A1 allele, we'll also vary this with Simulation Control
    freqA1A1 the observed frequency of the A1A1 genotype
    freqA1A2 the observed frequency of the A1A2 genotype
    freqA2A2 the observed frequency of the A2A2 genotype
    p this is the frequency of the p allele, measured every generation, this is the primary variable we are interested in tracking to determine effects of other variables on it
    driftSD the standard deviation for the distribution of changes due to drift; this will vary depending on the population size and is used in the drift variable
    drift the actual amount by which we will change the value of p for the next generation, causing change due to drift
    p adj the adjusted value of p, so the new value of p after adding the effect of drift; this is what we use to create the next generation
    toggledrift this variable let's us run the model with or without causing drift to happen; you adjust it once per run, drift is either on or off for the whole simulation, makes it easy to see what the effect of drift is compared to HW equilibrium

  7. Next, you need to add the appropriate arrows, try to follow the image above. You can check the Arrows Key table if needed (see the drop-down menu).
  8. Once we have our arrows placed, we are now in position to enter the equations for each variable. If we don't have the arrows correct, we won't be able to put the right equations in, so pay careful attention.

  9. Remember, to enter equations, click on the Equations icon and then click on the variable whose info you want to edit. In the box that pops up, you enter the correct information for that variable. We have a lot of variables so let's take this slowly.

  10. Let's start with the variables that we are just setting a value to initiate the model. For these, the equation is simple, it's just the starting value we are choosing to use. Use the following equations for the varibles below:
  11. Variable Parameter Value Min Max Incr
    p0 Equations 0.5 0 1 0.01
    N Equations 100 0 500,000 1

  12. Having set up N and p0, we can enter the initial values for the genotypes. Since we are assuming that mating is random, we expect genotypes to occur according to HW proportions, so we'll use those equations to set the initial value for our genotypes, multiplying by N so we get the number of individuals, not just the frequency.
    • Make sure you do NOT change the equation entry for the genotypes, that is already set up correctly to be the value of the flow.

    • Click on 'genotypeA1A1'. What should you enter for the initial value? (p0 * p0 * N)

    • For genotypeA1A2, use the equation N - genoA1A1 - genoA2A2. That ensures that we end up with the right number of total individuals, N.

  13. Once we have the numbers for the genotype stocks, we can set up the equations for freq11, freq12 and freq22. We mainly want these variables for ease of graphing and manipulation to do other calculations. What should the equations for these variables be?
  14. Now to the variable of greatest interest: p! How do we calculate the frequency of the allele using the genotype frequencies? If you aren't sure, check with Angie. You can also check the key at the end though it's best if you can try to work it out.

  15. For the variables related to drift (driftSD, seed, drift, and toggledrift and p adj), the equations are somewhat complicated so we'll learn a little about how they work as we enter them into the model. Also see the table below for quick reference.

  16. Ok, that was the hardest part. All we have left are the equations for deltaA1A1, deltaA1A2, and deltaA2A2. Because we are causing drift to happen on allele frequency directly, these equations will be similar to what we used last week with the HW model, just using p adj instead of p. So, for deltaA1A1, we get:

    deltaA1A1 equation is: (N * p adj * p adj) - genoA1A1

    Go ahead and set up the equations for deltaA1A2 and deltaA2A2 as well.

  17. Once you have finished entering equations, check to make sure they are correct (you may want to use the key at the end). Then we are ready to click Simulate to run the model. To get a graph, select the three genotype stocks (or whatever variable you want) and then click Graph. Remember that we set up the initial conditions so that drift will NOT occur (toggledrift = 0) so we expect to see just the HW model.
    • What happened? Does anything change on the graphs? What did you expect to see? Consider your responses as you write up the lab exercise in the next section.

  18. Now, to make sure that drift can occur as expected, we need to use Simulation Control to alter the value of toggledrift and then Simulate the model again.

  19. Once you have created those variations, click Simulate on the horizontal menu bar, click on p, and then click Graph. A graph should appear that has a line for each value of toggledrift that you tested. What happened? How might we explain what's going on?

    Note that for the drift model, we are primarily interested in what happens to the allele frequency, p, over time. So instead of graphing genotype frequencies, we want to graph allele frequency. We could also graph heterozygosity, as that's our common estimate of genetic variation.

    (Sidebar: you should see a flat line for the nodrift run and then two squiggly lines that are different from each other for run1 and run2.)

Lab Write-Up

Now that you have created an awesome working drift model, let's see what happens with multiple runs and altering starting conditions. Remember that since drift is a stochastic process, we cannot predict the outcome of any particular run but we can see trends from multiple runs. Thus, we'll need to look at multiple runs of any conditions we are testing out to identify trends. Remember that the effect of drift depends on both initial allele frequencies and total population size (a larger population takes longer to reach equilibrium).

For the questions below, you should pay attention to the following values for each run so that you can compare across different runs. Which variables change the outcome and in what ways? Does the time to fixation change? Does the size of change from one generation to the next change?

So you will be graphing p and possibly freqA1A2 (heterozygosity).

Examine the trends associated with these variables when you run the scenarios below. Before running them, try to predict what you think will happen and which variables will matter the most. Consider how the differences between scenarios demonstrates the way in which drift influences genetic diversity.

Questions

  1. Explore the effect of initial allele frequency on the outcome of the runs. Do 10 runs for each of the following sets of starting conditions:
    Trial p0 N Final Time toggledrift # runs
    1a 0.5 100 1000 1 to 10 10
    1b 0.2 100 1000 1 to 10 10
    1c 0.8 100 1000 1 to 10 10

    The easiest way to do this is to change the default values for p0, N, and Final Time then use Simulation Control to just change the value of toggledrift 10 times. You want 1 graph of p showing all 10 runs of trial 1a, a separate graph of the run for trial 1b and for trial 1c. You can change the Final Time by clicking on the Model drop-down menu and choosing Settings... I also suggest saving your graph for each trial. To save a graph, go to File --> Save. You can also rescale and zoom in or out by clicking on the graph (use the View menu to reset the scale).

    Now, tell me what you notice about what happens in these runs. Is equilibrium reached? Fixation or extinction? What changes when you change starting allele frequency? Write a summary of your observations.

  2. Examine what happens when you change population size. Use the following sets of starting conditions:
    Trial p0 N Final Time toggledrift # runs
    2a 0.3 10 1000 1 to 10 10
    2b 0.3 100 1000 1 to 10 10
    2c 0.3 1000 1000 1 to 10 10

    Again, you should do at least 10 runs for each value of population size so that you can try to detect any trends. What trends do you notice? How does population size influence the equilibrium that's reached? Feel free to explore what happens with other values of N or with other values for p0. Write up a summary of your observations.

  3. Finally, in paragraph form, give me a summary of the overall ideas/patterns demonstrated in the drift model (you don't need to repeat yourself from earlier questions). Reflect on what you've observed. Are any of the results surprising to you? Or do they change how you think about the consequences of drift? How might these results influence thinking about drift as a potential issue in conservation? (You are, of course, welcome to test out other conditions in your model!)

Remember to submit your model and responses as a PDF via the Google Drive Form: https://forms.gle/E3QLACX1aA2Q6ist5. Write-up are due by the next lab period.


Key to Arrows

Below find the key to arrow placements for each variable.

Variable Arrow pointing to Arrow arriving from
genoA1A1 freqA1A1; deltaA1A1; genoA1A2 p0; N
genoA1A2 freqA1A2; deltaA1A2 N; genoA1A1; genoA2A2
genoA2A2 freqA2A2; deltaA2A2; genoA1A2 p0; N
N genoA1A1; genoA1A2; genoA2A2; freqA1A1; freqA1A2; freqA2A2; deltaA1A1; deltaA1A2, deltaA2A2; driftSD none
freqA1A1 p genoA1A1; N
freqA1A2 p genoA1A2; N
freqA2A2 none genoA2A2; N
p freqA1A1; freqA1A2 p adj; driftSD
driftSD drift N; p
toggledrift none p adj; drift
drift p adj driftSD; toggledrift
p adj deltaA1A1; deltaA1A2; deltaA2A2 p; drift; toggledrift
deltaA1A1 none p adj; N; genoA1A1
deltaA1A2 none p adj; N; genoA1A2
deltaA2A2 none p adj; N; genoA2A2

Key to Equations

Below find the table of equations for each variable.

Variable Parameter Value of equations or initial
N Equations 100
p0 Equations 0.5
genoA1A1 Initial p0 * p0 * N
genoA1A2 Initial N - genoA1A1 - genoA2A2
genoA2A2 Initial (1 - p0) * (1 - p0) * N
freqA1A1 Equations genoA1A1 / N
freqA1A2 Equations genoA1A2 / N
freqA2A2 Equations genoA2A2 / N
p Equations freqA1A1 + 0.5 * freqA1A2
driftSD Equations SQRT( (p * (1 - p) ) / (2 * N) )
toggledrift Equations 0
drift Equations RANDOM NORMAL( -2, 2, 0, driftSD, toggledrift)
p adj Equations IF THEN ELSE(toggledrift = 0, p , IF THEN ELSE( (p + drift) >= 1 , 1 , IF THEN ELSE( (p + drift) <= 0 , 0 , p + drift ) ) )
deltaA1A1 Equations (p adj * p adj * N) - genoA1A1
deltaA1A2 Equations (2 * p adj * (1 - p adj) * N) - genoA1A2
deltaA2A2 Equations ( (1 - p adj) * (1 - p adj) * N) - genoA2A2