Saturday, March 15, 2014

More on Winter, Seed Germination, and Bayesian Inversion

Dreams of Spring and Bayesian Inversion
Ah…it has come to that time of year when we are all dreaming of being outside without a jacket on. Some subset of those dreamers are also thinking about their gardens, getting their fingers in the soil, and watching the miracle of life push its way up out of the soil. Some of us may have already started the year’s first plants – an act which seems to fit into the “audacity of hope” category given the number of record low temperatures set in the last week of February and first two weeks of March. But given the reality of trying to grow a hot pepper in Wisconsin, getting an early start is the only way.

Coldest Winter in Decades for Southern Wisconsin

As we can see from this data provided by the Department of Atmospheric Sciences at the University of Wisconsin, the coldest winter since 1980 for Madison, WI was the winter of 1985-86, when the average temperature (black line) was 16.2 degrees Fahrenheit for the three-month period from December through February.

This year will crush that mark, with an average of 14.0 degrees. And of course if you get away from the “heat island” effect of Madison, you can take a couple more degrees off that number: temperatures where I live, 30 miles southwest of Madison, have averaged 12.1 degrees for the winter. You’d have to go back to the winter of 1885 to find a colder average temperature in Madison!

Now one of the great pleasures of life in a crazy cold place like this is warming yourself up next to a wood stove, and we are blessed with an efficient and lovely one that we tend to keep going on the particularly cold days. To see how this connects to Bayesian Inversion we need to talk about peppers: hot peppers.

Germinating Hot Peppers

You see, it turns out that for many people growing the genus Capsicum is something of an obsession, and while I wouldn’t call myself a “Chili Head,” I certainly like spicy food and thought I’d try my hand at growing some plants this year. The cultivars I selected were the Long Thin Cayenne (30,000-50,000 Scoville units) and the Datil (100,000-300,000 Scoville units). Hot peppers are the topic of many fiery discussions in the gardening community, and entire forums are dedicated to the subject. While perusing these sites for some basic advice, I realized (much to my chagrin) that it is notoriously difficult to get these little suckers to germinate and sprout. In addition, I learned that the company (which will remain nameless) from which I ordered seeds has a mixed reputation. (A friend recently told me the company has been accused of being expert at seed catalogues but a bit spotty on seed genetics.)

So it was with some trepidation that I set out to sprout my first seeds of the year. We picked the Datil peppers, as there seemed to be a consensus that the hotter peppers needed more time to grow. The Datil is an exceptionally hot pepper, a variety of the species Capsicum chinense. According to the aforementioned forums, Capsicum chinense varieties have an exceptionally long germination period (12-25 days). It also seemed that while there was a lot of experience on the forums, there was very little data. In fact, I noticed a thread where somebody asked for a chart of germination times and was greeted with a chorus of responses that focused on how much germination time varies, rather than offering any data. Since I hear this objection so often in my professional life, my interest was further piqued. Certainly there will be variation in germination times, because of both seed variance and method, but it should still be possible to assemble data and report it.

It was with some surprise that I discovered more than 25% of the seeds had germinated on day five. An additional 25% germinated on day six, and by day seventeen, 78% of the seeds had germinated (see chart below).[1]   

Calculating Population Proportion
Many seed companies test a batch of seeds and report on their seed packages what percentage germinated. As we all intuitively know, the more seeds tested, the more confidence we can have in the reported germination rate; but you’ll have some uncertainty in the true germination rate of the whole population (or “population proportion”) even if you test a relatively large sample. The consulting firm I work for specializes in measurements and small sample statistics and has a useful calculator among their power tools called “Bayesian Population Proportion,” which allows a user to calculate confidence intervals for germination rates. For example, if you test 200 seeds and 169 germinate, you would report an 84.5% germination rate. But without other prior knowledge of germination rates for that population of seed, you could only be 90% confident that the true germination rate for the whole population was between 79.7% and 88.1%. The exact width of that range is not entirely intuitive, but the underlying idea is: we cannot know the germination rate exactly unless we test the whole population.
Returning to my Datil seed experiment, we can use this calculator to find a 90% confidence interval for the germination rate of the entire batch of Datils. The sample size was 18 seeds, of which 14 germinated. Using the BayesianPopProportion tool we find that the 90% confidence interval for the germination rate of the whole population is 58% to 89%. That's plenty high for me!
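
For readers who want to try this kind of calculation themselves, here is a minimal Python sketch. It is not the BayesianPopProportion tool; it assumes a uniform prior on the germination rate (so the posterior is a Beta distribution) and reports an equal-tailed interval. The function names are made up for illustration, and the results may differ slightly from the tool's if it rests on different assumptions.

```python
from math import comb


def beta_cdf(x, a, b):
    """CDF of a Beta(a, b) distribution for positive integer a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def beta_quantile(q, a, b, tol=1e-9):
    """Invert the CDF by bisection on the interval [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def germination_interval(germinated, tested, level=0.90):
    """Equal-tailed interval for the true rate, assuming a uniform prior."""
    # Uniform prior + binomial data -> posterior Beta(a, b)
    a, b = germinated + 1, tested - germinated + 1
    tail = (1 - level) / 2
    return beta_quantile(tail, a, b), beta_quantile(1 - tail, a, b)


# The two samples discussed in the text
for germinated, tested in [(169, 200), (14, 18)]:
    low, high = germination_interval(germinated, tested)
    print(f"{germinated}/{tested}: {low:.1%} to {high:.1%}")
```

The bisection search is slower than a library quantile function, but it keeps the sketch free of dependencies beyond the standard library.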

Using Bayesian Inversion to Calculate Required Sample Size
Different users will have different needs in terms of confidence and germination rates. My own bar was low: I wanted to feel 90% confident that at least 20% of the seeds would germinate, or I would look to re-buy the seeds elsewhere. But I was just growing the Datils on a lark, knowing that a hot climate plant like this needs a lot of babying to produce up here in the north, and would never be a “main crop” producer in our garden. Someone who runs a Community Supported Agriculture (CSA) business would have a much higher bar for germination rates. Their livelihood depends on successful germination, and seeds can make up 5-10% of their costs. Such a farmer might need to feel 95% confident that at least 75% of their seeds would germinate. I can imagine an even more restrictive example: I would want to have 95% confidence that less than 1% of a population of Space Shuttles blow up on launch before agreeing to ride one into space. How many samples do we need to achieve these various confidence levels? Assuming every sample succeeds, the answers are respectively: one seed for the Datil sample, 13 seeds for the CSA farmer, and 300 launches of a space shuttle without a failure.
I created a little tool to calculate these values (Figure 3), which I'm happy to share if you are interested. You select your requirements for confidence level and success rate, and the tool reports the required sample size for samples with zero to three failures.

Figure 3: Calculate sample size requirements for various confidence and population proportions.
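
The zero-failure case can be sketched in a few lines of Python. This is an illustration rather than the tool itself, and it assumes a uniform prior: after n successes in n trials the posterior is Beta(n + 1, 1), so the probability that the true success rate is at least t is 1 - t^(n + 1). The function names are hypothetical. Under these assumptions the sketch agrees with the one-seed answer for the Datils, but it gives somewhat smaller counts than the figures quoted above for the farmer and the shuttle, so the actual tool may rest on assumptions not spelled out in this post.

```python
from math import ceil, log


def confidence_after_clean_sample(n, threshold):
    """P(true success rate >= threshold) after n successes in n trials.

    Assumes a uniform prior, so the posterior is Beta(n + 1, 1).
    """
    return 1 - threshold ** (n + 1)


def clean_sample_size(confidence, threshold):
    """Smallest all-success sample that reaches the requested confidence."""
    # Solve 1 - threshold**(n + 1) >= confidence for the smallest integer n.
    n = ceil(log(1 - confidence) / log(threshold)) - 1
    return max(n, 0)


# (confidence, minimum success rate) for the three scenarios in the text
scenarios = {
    "Datils on a lark": (0.90, 0.20),
    "CSA farmer": (0.95, 0.75),
    "Shuttle passenger": (0.95, 0.99),
}
for label, (confidence, threshold) in scenarios.items():
    n = clean_sample_size(confidence, threshold)
    print(f"{label}: {n} trials, all successful")
```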

Since we bought many of our seeds this year from a place with a supposedly mixed reputation, I’m going to sample a wide variety of seeds to see whether I need to re-order any of them. This is where my calculator will come in handy. For the main plants in our garden, I want to have 75% confidence that at least 60% of the seeds will germinate. Surprisingly, I only need a sample of two seeds, provided both germinate, to achieve this. In any sample where one or both of the seeds fail, I can test an additional nine seeds. As long as at least eight of the eleven seeds germinate, I have met my requirements.
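
A sample that includes some failures can be checked the same way. The following Python sketch is an assumption-laden illustration, not the calculator itself: it uses a uniform prior, so that after some seeds germinate out of those tested the posterior is a Beta distribution whose upper tail can be written as a binomial sum. The function names and the `max_seeds` cap are hypothetical. Under that assumption it reproduces the two-of-two and eight-of-eleven thresholds mentioned above.

```python
from math import comb


def prob_rate_at_least(germinated, tested, threshold):
    """Posterior P(true rate >= threshold), assuming a uniform prior.

    The posterior is Beta(germinated + 1, tested - germinated + 1); its upper
    tail equals P(Binomial(tested + 1, threshold) <= germinated).
    """
    n = tested + 1
    return sum(
        comb(n, j) * threshold**j * (1 - threshold) ** (n - j)
        for j in range(germinated + 1)
    )


def seeds_needed(confidence, threshold, failures, max_seeds=1000):
    """Smallest sample with exactly `failures` duds that meets the target."""
    for tested in range(failures, max_seeds + 1):
        if prob_rate_at_least(tested - failures, tested, threshold) >= confidence:
            return tested
    return None  # target not reachable within max_seeds


# Garden requirement from the text: 75% confidence in a 60% germination rate
for failures in range(4):
    print(failures, "failures ->", seeds_needed(0.75, 0.60, failures), "seeds")
```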

Winter was cold. I’m looking forward to spring. Using statistics to help in your daily life is a hot topic.
Please let me know if you found this article interesting by commenting below. And stop by the Hubbard Decision Research website if you are interested in statistical tools to download.

[1] For anyone interested, I used the "wet paper towel in a baggie" method and kept the baggie behind my wood stove, lying on the bricks. I did go to the trouble of finding a place where the temperature varied between 65 and 90 degrees. Most of the day it is between 75 and 85 on the bricks, and it cools down below 70 for only a few hours in the early morning each day.
