Thursday 16 August 2018

An interlude: up by the bootstraps

And we're back. Wonder if anyone'll notice before Christmas?

First couple of posts are going to relate to an old matter, and the first really is a bit of a scene setter. You'll see why in the next post.

Today we're going to play with some statistics.

Standard error- a thickie's guide

If you only have a small dataset, then the 'standard error' within that dataset could well be large. In English? A birder who goes once a month to the nearby reserve and says it's his 'local patch' would produce a pretty wobbly write-up for common species.

If you have a large dataset, then the 'standard error' within that dataset is much more likely to be small. Another birder on the same site, out 300+ days of the year, would have the better grasp on the site.

If you only count 'your' site on three dates in three months, then there's a really high chance you will have missed the more meaningful days. But a dozen once-a-monthers could combine their data. In the olden days, that was the local grapevine and the Ornithological Society. Now it's taken to higher levels by the stats gurus.
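If you fancy seeing that in numbers, here's a minimal sketch in Python. Everything in it is invented for illustration: a made-up run of daily counts for the 300-days-a-year birder, and a once-a-month sample drawn from it. It just shows how the standard error shrinks as the number of visits goes up.

```python
import random
import statistics

random.seed(1)

# Entirely made-up daily counts of a common species at one site.
daily_counts = [random.gauss(120, 30) for _ in range(300)]   # the 300-days-a-year birder
monthly_counts = random.sample(daily_counts, 12)             # the once-a-month birder

def standard_error(counts):
    # Standard error of the mean = sample standard deviation / square root of the number of counts.
    return statistics.stdev(counts) / len(counts) ** 0.5

print("Once-a-monther's standard error: ", round(standard_error(monthly_counts), 1))
print("Every-day birder's standard error:", round(standard_error(daily_counts), 1))
```

Same site, same birds; the only difference is how often someone looked.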


Confidence interval- a thickie's guide

A confidence interval is a posh way of saying the correct value is likely to be between 'x' and 'y'.

If your dataset has a couple of stupid counts in it, say an average of 1,000 but one morning when there were just 6 seen (fog) or another with 2,000 (migration fallout), then you might have no confidence in those outlying results. You can throw 'em out by simply ignoring them. Or you can throw 'em out by having a confidence interval: by only using, say, the middle 95% of your data, and ignoring the best and the worst. (WeBS helps their counters with this by asking them to note if their area coverage was complete; their counters know a low count won't make it into the national stats cauldron when the magic is worked on the numbers.)

Now 95% is a pretty common confidence level in statistics, and you can put a fair bit of trust in an interval built that way.

Three observers count a flock. 2,000, 3,000, 4,000. Can't do too much with that. And just throwing out a number means birders getting antsy about high/low counts, and individual birders pee'd off that they're not believed.

Now if 10 observers had submitted counts (p'raps there was a rare that day): 1,000, 1,750, 1,800, 1,850, 1,875, 1,900, 2,000, 2,125, 2,140, 4,000. You can start to have some confidence the actual flock size is somewhere between 1,750 and 2,140. And work towards publishing a meaningful figure in the annual report.
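For the curious, here's that 'keep the middle of the data' idea as a tiny Python sketch using the ten counts above. It's a crude trim of the extremes rather than a proper confidence interval, but it mirrors the idea.

```python
# The ten (hypothetical) counts submitted for the flock.
counts = [1000, 1750, 1800, 1850, 1875, 1900, 2000, 2125, 2140, 4000]

# Drop the single lowest and highest counts and report the range of what's left.
trimmed = sorted(counts)[1:-1]
print(f"Likely flock size somewhere between {min(trimmed)} and {max(trimmed)}")
# -> Likely flock size somewhere between 1750 and 2140
```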

The magic tricks used by Statisticians aren't really tricks; within their framework they mirror common sense reality. There's a whole field, a flavour of the month right now, called Bayesian Statistics, that works this way with probabilities. Like a stage magician's audience, us mere mortals can but marvel at how the results they produce are somehow 'right'. True magic!

But they still need raw data. What I've been musing on is how they cope when a dataset is small.


Bootstraps!

So you've got a small dataset. Say just a hundred entries. You find your standard errors and confidence intervals, but you still don't trust them- they might be weak, but you can't go out and collect any more data.

Simples! Do the magic trick over and over. Just re-use your original data, over and over again. Witchcraft?! It certainly is- one trick that, despite many doubters in the audience, works.

(The Magician's secret revealed)

Okay, the forgettable bit. If you really wanted to, how do you bootstrap?

You only have 100 pieces of data.

Print out all your data on little individual slips, and put all 100 in a bag.

Draw one out, and write it down. That's your new data.

Put it back in the bag.

Draw again, from all 100 individual slips.

You now have two new pieces of data to work with.

And repeat.

And repeat.

And repeat until you have thousands and thousands and thousands of new pieces of data.

Now use this huge dataset to calculate your stats. (Thank the Lord for computer programmes that can do all that drawing out, eh!?)
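If you'd rather let the computer do the drawing out, here's a rough sketch of the whole bag-of-slips routine in Python. The 100 original counts are invented, and the interval at the end is the simple 'percentile' flavour of bootstrap- other flavours exist.

```python
import random
import statistics

random.seed(42)

# Your original, small dataset: 100 hypothetical counts (entirely made-up numbers).
original = [round(random.gauss(1000, 300)) for _ in range(100)]

# The bag of slips: draw one at random, write it down, put it back, and repeat
# until you have a 'new' dataset the same size as the old one. Do that thousands of times.
def bootstrap_means(counts, n_resamples=10_000):
    means = []
    for _ in range(n_resamples):
        resample = [random.choice(counts) for _ in counts]   # drawing with replacement
        means.append(statistics.mean(resample))
    return means

means = sorted(bootstrap_means(original))

# A simple percentile 95% confidence interval: lop off the lowest and highest 2.5% of the means.
lower = means[int(0.025 * len(means))]
upper = means[int(0.975 * len(means))]
print(f"Bootstrap 95% confidence interval for the mean count: {lower:.0f} to {upper:.0f}")
```

With only 100 real counts, all this resampling can't invent information that isn't there, but it does give an honest feel for how much the average would wobble if you could keep going back to the site.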

Anyone with a stats O level or similar, please don't sue me. I'm fick and this is an imprecise guide. I only want the average birder to have a rough idea of how and why their BBSs, Birdtracks, WeBS, get played with. We often think the figures we've provided do the hard work. Well, without them the magic couldn't happen. But the magic has to happen.


Challenging results that are challenging

The reason I'm saying all this? To explain that when the stat wizards say the results of one methodology are comparable with another's, they've probably used wizardry like bootstrapping. There are ways and means to compare results gained from one methodology against results from another. A bird survey that needs only three visits has been compared to a daily survey and shown, once tinkered with, to produce comparable results. Of course, neither might actually be precise! But if the confidence intervals overlap, we're getting there.
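A rough illustration of that overlap check, again in Python and again with every number invented: bootstrap a confidence interval from each survey's results and see whether the two intervals overlap.

```python
import random
import statistics

random.seed(7)

def bootstrap_ci(counts, n_resamples=10_000, level=0.95):
    """Percentile bootstrap confidence interval for the mean of 'counts'."""
    means = sorted(
        statistics.mean([random.choice(counts) for _ in counts])
        for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    return means[int(tail * n_resamples)], means[int((1 - tail) * n_resamples)]

# Entirely invented results from two methodologies at the same site.
three_visit_survey = [900, 1100, 1300]                        # a survey needing only three visits
daily_survey = [random.gauss(1050, 200) for _ in range(300)]  # near-daily coverage

ci_three = bootstrap_ci(three_visit_survey)
ci_daily = bootstrap_ci(daily_survey)

# The two methodologies look comparable if their confidence intervals overlap.
overlap = ci_three[0] <= ci_daily[1] and ci_daily[0] <= ci_three[1]
print("Three-visit survey 95% CI:", [round(x) for x in ci_three])
print("Daily survey 95% CI:      ", [round(x) for x in ci_daily])
print("Do the intervals overlap? ", overlap)
```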

For a practical survey methodology, the statisticians cut data collection to the bare minimum, while ensuring it is robust enough to produce a result comparable to the actual picture.

Win win. The observer, the citizen scientist, has a figure they have absolute faith in. They will tell you there were 4,000 of 'x' last week with absolute certainty. And they're happy. Of course, they could still be well out. It doesn't really matter when the stat wizards might well receive 999 different figures from 999 observers that they can play with. Observer happy, statisticians happy. All the worrying about whether there's 500, 700, 900 present? Doesn't really matter.

No, they just guide us by giving us, well, guidelines. In your BBS, try not to go out with a team of 3 or 4 counters- better if all the counting is done by just one set of eyes. In your Garden Birdwatch, stick to the time limits. On WeBS, coordinate counts. They're guidelines. They know we're human and some go out with a mate, some will go way beyond the time limits to get the extra species, and some will count on the covering tide on a weekday, rather than at high tide at the weekend, because that's when the most birds are there.

And it's why you should never cheat. I knew one site where one excellent birder was out every day, and got some great peak monthly counts. And then that area's WeBS counter went through the birder's figures (without their permission or knowledge) and instead claimed every peak monthly count as the number seen on WeBS day. Done with the best intent, but hyper-inflating the site's value against others.

If it's a national survey, they are looking for nationally comparable figures. You're not doing it to get 'high counts'.

Regional trends can be pulled out by a statistician once they play with confidence intervals, once they vet the data to remove those who admit to relaxing the guidelines. Locally, though, they might not have enough confidence in your dataset to pull out a meaningful result. Which is why they appeal for as many records as possible.


-------

Now, you're probably wondering why I've rambled on. Well, what if you have different surveys running in tandem? Comparability and compatibility. That's the background for the next blogpost.