Is it continuous? Would the less-than-zero problem ever actually happen, or are you just annoyed that it does?
If it's continuous and "probably normal" because it's kind of like the sum of a bunch of observations, you probably want a gamma == sum of exponential variables.
If it's discrete and "probably normal" for the same reason, you probably want negative binomial == sum of geometrics or just plain old binomial == sum of Bernoullis.
It's continuous, and the problem is likely to happen.
Basically, we have an agent-based model where all our little software people are deciding whether or not to evacuate based on whether the number of warning signs they've seen has hit their panic threshold yet. The distribution of thresholds is what I'm concerned with here.
It doesn't really make sense for the threshold to be negative, but there's a fraction of the population that will evacuate given any provocation at all, so their threshold is basically zero. It's "probably normal" on the grounds that, well, it's people. So it's really based on a jillion unobservables, but we're calling it (likely normal) random noise.
(Might need to use it for a couple other things, like response delay and attention paid to official announcements, both of which make no sense being negative.)
I was thinking about a gamma distribution, actually. I just have to remember what the hell ranges of coefficients give you halfway-normal looking curves. Thanks!
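For what it's worth, here is a minimal sketch of how one might pick gamma coefficients from a target mean and standard deviation. The function names and the example numbers are made up for illustration; the only facts used are that a gamma with shape k and scale theta has mean k*theta, variance k*theta^2, and skewness 2/sqrt(k), so a bigger shape gives a more normal-looking curve that still never goes below zero.

```python
import math
import random


def gamma_params(mean, sd):
    """Convert a desired mean and standard deviation into gamma (shape, scale)."""
    shape = (mean / sd) ** 2   # k: the larger this is, the more symmetric the curve
    scale = sd ** 2 / mean     # theta
    return shape, scale


def gamma_skewness(shape):
    """Skewness of a gamma distribution; it depends only on the shape."""
    return 2.0 / math.sqrt(shape)


def draw_thresholds(n, mean, sd, seed=0):
    """Draw n strictly positive thresholds (hypothetical helper for the agents)."""
    rng = random.Random(seed)
    shape, scale = gamma_params(mean, sd)
    # random.gammavariate takes (shape, scale), so the mean is shape * scale.
    return [rng.gammavariate(shape, scale) for _ in range(n)]
```

With a mean of 10 and an sd of 2, this gives shape 25 and skewness 0.4, which already looks fairly bell-shaped. As a rough rule of thumb (an assumption, not a hard cutoff), a shape near 1 looks exponential and a shape in the tens looks close to normal.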
This seems weird.
I realize I don't understand your way of thinking about this concept. But let me propose a way of thinking that makes more sense to me, at least:
Let x be the rv that is the number of times a person needs to be warned before they decide to evacuate. [I realize that in your model, probably warnings have weights, but for my sanity's sake, let's say all "warnings" have weight 1.] A person's threshold is the number of warnings that it takes before they run.
It seems to me that what you actually have is just an infinite-state Markov chain modeling the people: it has states x_0, x_1, ... and RAN, where the index i in x_i is the number of warnings the person has previously seen, if they've not evacuated yet, while RAN is the catchall state for "I ran like hell." You want to know the distribution of the time of first occupancy of RAN.
It seems odd to assume that this distribution is quasi-normal. Quasi-geometric sounds more likely to me: that would suggest that every new warning convinces the same fraction of stragglers to run.
So, the agents are integrating signals from a number of sources (weighted, yes indeed) to determine their general alarm level. And once they hit their threshold, they act. That seems to me to be equivalent to what you've said above, just continuous instead of discrete.
I don't follow once we get to the infinite-state Markov chain. How does it have any implications for the distribution? In other words, it seems like the markov chain representation is just a way of making a histogram of the agents' thresholds... what am I missing?
Note also that one of the signals the agents integrate is "who else has already evacuated", which would violate one of the assumptions for markov processes, no?
Oh, the chain is just to think about it in a different way. The point is that if the person is truly Markovian, the most no-information idea is to wrap all of the x_i states together into a single state, with probability p of "run like hell" and probability 1-p of "stay." In this context, the distribution of thresholds is geometric, and making it normal is the odd thing, not somehow a default. [Sorry--somehow I thought I'd written that, but I guess I just thought it...]
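To make the geometric claim concrete, here is a small sketch under exactly the assumption stated above: every warning has weight 1 and a memoryless agent runs with the same probability p each time. The function names and the value of p are invented for the example. It compares the closed form P(threshold = n) = (1-p)^(n-1) * p against a simulation of the collapsed chain.

```python
import random


def geometric_pmf(n, p):
    """P(threshold == n): stay through n-1 warnings, then run on the n-th."""
    return (1 - p) ** (n - 1) * p


def simulate_threshold(p, rng):
    """Count warnings until a memoryless agent enters the RAN state."""
    warnings = 1
    while rng.random() >= p:   # with probability 1-p the agent stays put
        warnings += 1
    return warnings


def empirical_pmf(p, trials=100_000, seed=0):
    """Estimate the threshold distribution by simulating many agents."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        n = simulate_threshold(p, rng)
        counts[n] = counts.get(n, 0) + 1
    return {n: c / trials for n, c in sorted(counts.items())}
```

Calling empirical_pmf(0.3) and comparing each entry with geometric_pmf(n, 0.3) should show the same steadily shrinking tail: most agents bolt early, and each further warning peels off the same fraction of whoever is left.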
Yes, "who's already gone" would violate the Markovianness.
Ah! Okay, gotcha. Yeah, I think people are definitely non-Markovian in their decision process.
I'll fool around with gammas some. (Hopefully, when we do some sensitivity analysis of the model, one of the things we'll find out is that the details of the distribution of thresholds are not especially important. That'd be convenient.)
There are a few decent handbooks on basic distributions that are worth having on your shelf. I don't own any of them (I have a mathematical statistics book), but getting one is sensible.
I think the assumption that the threshold is based on the number of warnings is flawed.
I suspect there's some fraction of the population that will run on first warning, and some that won't run at all, ever. But for the ones in between, the interesting threshold is probably the % of the population that already ran. As that happens, the odds of someone you know having run go up sharply, and consequently your odds of doing so as well.
So I'd expect a small chunk at first, an accelerating rampup, and then a sudden plunge.
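A toy version of that dynamic can be sketched as a global-fraction threshold cascade. Everything here is an assumption made for illustration, not the actual model being discussed: each agent's threshold is the fraction of the population that must already have left, 0 means "runs at the first warning", infinity means "never runs", and the in-between group gets gamma-distributed thresholds with made-up parameters.

```python
import random


def make_population(n, eager=0.05, never=0.10, mean=0.3, sd=0.15, seed=0):
    """Build hypothetical thresholds: eager runners, never-runners, and the rest."""
    rng = random.Random(seed)
    shape, scale = (mean / sd) ** 2, sd ** 2 / mean
    population = []
    for _ in range(n):
        u = rng.random()
        if u < eager:
            population.append(0.0)            # runs on the first warning
        elif u < eager + never:
            population.append(float("inf"))   # won't run at all, ever
        else:
            population.append(rng.gammavariate(shape, scale))
    return population


def run_cascade(thresholds, max_steps=1000):
    """Iterate until nobody new evacuates; return the evacuated fraction per step."""
    n = len(thresholds)
    fraction = 0.0
    history = []
    for _ in range(max_steps):
        # An agent leaves once the evacuated fraction reaches its threshold.
        new_fraction = sum(1 for t in thresholds if t <= fraction) / n
        history.append(new_fraction)
        if new_fraction == fraction:
            break
        fraction = new_fraction
    return history
```

Running run_cascade(make_population(2000)) and plotting the result shows how much of the population ends up leaving and how fast. The outcome is quite sensitive to the low end of the threshold distribution: in this sketch, run_cascade([0.0] + [0.02] * 99) stalls at 1% evacuated, while run_cascade([i / 100 for i in range(100)]) goes all the way to everyone.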
Well, I was already assuming (correctly, it appears) that his system was working based on some way of integrating who's already run into a number.
I'm actually debating whether to do it using a global "fraction evacuated" number or to do it using random encounters...
That's separate from the number of people you know personally who have evacuated, for which we need to make a simple model social network.
I suspect there's some fraction of the population that will run on first warning, and some that won't run at all, ever.
Absolutely correct. (Though there are some interesting questions about won't run versus can't run.)
But for the ones in between, the interesting threshold is probably the % of the population that already ran. As that happens, the odds of someone you know having run go up sharply, and consequently your odds of doing so as well.
Yup. The really interesting part (I suspect) will be how the signal of other people deciding to act is filtered through people's social networks, because how your friends are acting has an entirely different kind of influence than how "everybody" is acting.
It'll probably usually be a sigmoid curve, but where it tops out and how it unfolds over time are really useful details we hope the model can provide.
attention paid to official announcements, both of which make no sense being negative
While it may be a really small proportion that wouldn't affect your program, I would note that in RL, people are perverse enough that they will pay negative attention to official announcements. Tell them that they should leave, and that's the impetus for them to stay.
I think the number of people THAT perverse, especially with regard to life-threatening emergencies, is in fact negligible. I'll have to look into whether it's negligible when it comes to things like flu shots, though. (Another system we can study with the same model.)
There's an interesting phenomenon that some subpopulations (criminals, for example) may find it advantageous NOT to evacuate if everyone else is evacuating. But if we end up modeling that, I think the way to do it will probably be to give them an extra, negative signal representing the opportunity, because the normal meaning of the other signals (mortal danger) holds just as much for them as everyone else...
I'm unconvinced that many people hear from their friends that they've gotten their flu shots. I certainly never tell anyone except da_lj that he needs to get his. I started getting mine because a nurse practitioner yelled at me...