Gaussian Distribution - Everything2.com

(thing)

by Cyt

Thu May 04 2000 at 10:06:48

AKA the Normal distribution. The statistical distribution. Its importance flows from the fact that:

Any sum of independent Normal distributed variables is itself a Normal distributed variable
Sums of many independent variables that, individually, are not Normal distributed tend to become Normal distributed (asymptotically)
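The second point can be watched happening in a small simulation. This is only a sketch: the choice of exponential variables, the 30 terms per sum, the sample size and the seed are all assumptions made for illustration, and skewness is used as a crude measure of how far a sample is from the symmetric Gaussian shape.

```python
import random
import statistics

def skewness(samples):
    """Sample skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    mean = statistics.fmean(samples)
    sd = statistics.pstdev(samples)
    return statistics.fmean(((s - mean) / sd) ** 3 for s in samples)

def sum_of_exponentials(n_terms, rng):
    """Sum of n_terms independent Exponential(1) draws, each strongly skewed."""
    return sum(rng.expovariate(1.0) for _ in range(n_terms))

rng = random.Random(12345)  # fixed seed (arbitrary) so the run is repeatable
single = [sum_of_exponentials(1, rng) for _ in range(20000)]
summed = [sum_of_exponentials(30, rng) for _ in range(20000)]

skew_single = skewness(single)  # theory: 2 for one exponential variable
skew_summed = skewness(summed)  # theory: 2 / sqrt(30), about 0.37
print(skew_single, skew_summed)
```

The skewness of the sums should come out much closer to zero than that of the single variables, and it keeps shrinking as more terms are added.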

You won't find many stochastic variables on this planet that are not Gaussian in nature.

Examples

The number of single girls in a bar (when measuring e.g. every day at noon in the same bar)
The number of cars passing a point on the highway (go ahead: spend an hour a day, 1000 days in a row, and watch the distribution curve get smoother and smoother until it is perfectly Gaussian)
The height of Japanese people
The number of bytes/links/images on a homepage (this one would be easy to check)

Let Z be a Gaussian distributed stochastic variable with mean=0 and standard deviation=1.

Prob(|Z|>=1) ≈ 0.3173105
Prob(|Z|>=1.96) ≈ 0.0499957
Prob(|Z|>=3.29055) ≈ 0.0010000
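These three figures can be reproduced with nothing but the standard library. A minimal sketch, assuming the identity Prob(|Z|>=z) = erfc(z/sqrt(2)) for a standard normal Z; the function name two_sided_tail is made up for this example.

```python
import math

def two_sided_tail(z):
    """P(|Z| >= z) for a standard normal Z, via the complementary error function."""
    return math.erfc(z / math.sqrt(2.0))

for z in (1.0, 1.96, 3.29055):
    print(f"Prob(|Z| >= {z}) = {two_sided_tail(z):.7f}")
```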

For the reader who is not so much into mathematics:
The small list shows that the probability of finding a value in the data set that lies more than 3.29055 standard deviations away from the mean is 1 in a thousand. So - if all cars on the highway are doing 50 plus/minus 10 (mean 50, standard deviation 10), only one car in a thousand will do more than 50+10*3.29055, which is about 83, or less than 50-10*3.29055, which is about 17. (Or to use the first entry in the list: the chance that there are more single girls in a bar than one standard deviation above the usual number is 31.7%/2 = 15.9% - go push your luck!) Well folks - that's all for now. Thanks for letting me use this place as a test stage for my thesis, where I'm actually discussing small uninteresting matters like this (focusing a little less on single girls, though).
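The highway arithmetic can also be run backwards, from the probability to the speed. This is a sketch using statistics.NormalDist from Python 3.8 or later; it assumes "50 plus/minus 10" means a mean of 50 and a standard deviation of 10, and the variable names are invented for the example.

```python
from statistics import NormalDist

speeds = NormalDist(mu=50, sigma=10)  # assumed reading of "50 plus/minus 10"

# One car in a thousand outside the band means 1 in 2000 in each tail.
upper = speeds.inv_cdf(1 - 0.0005)  # speed exceeded by 1 car in 2000
lower = speeds.inv_cdf(0.0005)      # speed undercut by 1 car in 2000

# Chance of landing more than one standard deviation above the mean.
one_sd_above = 1 - NormalDist().cdf(1.0)

print(round(upper, 1), round(lower, 1), round(one_sd_above, 3))
```

The two speeds come out near 83 and 17, and the last figure is the roughly 16% chance from the bar example.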

And to ariels - yes - you're absolutely right. You'd also never find a car going faster than the speed of light, even though it SHOULD happen de temps en temps if the velocities were truly Gaussian distributed. Forgive my engineer-geekish way of looking at things (eg. 0.98 is not close to 1, it IS 1)

(idea)

by ariels

Thu May 04 2000 at 13:07:12

All of the examples above are of non-negative quantities, but the Gaussian distribution is unbounded, and in particular always attains negative values with non-zero probability! So all the examples are wrong, at least in the strict sense.

Whether the Gaussian or Poisson distribution is more common depends on what, exactly, you measure. But it is true that many naturally occurring random variables are approximately Gaussian. This is a consequence of the Central Limit Theorem alluded to above: the average of N iid random variables (which have finite variance, if you must get technical), once centred on its mean and scaled by the square root of N, converges in distribution to a Gaussian variable. So if you look at people's heights, they're not normally distributed (since they're always positive). But (presumably due to some underlying stochastic process) they can be modelled with reasonable accuracy as a sum of iid random variables; this, in turn, may be approximated by a normal distribution.
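To put a number on "non-zero probability", here is a sketch that borrows the car speeds from the first writeup, again assuming they mean a Gaussian with mean 50 and standard deviation 10, and asks how often that model produces a negative speed.

```python
from statistics import NormalDist

speeds = NormalDist(mu=50, sigma=10)  # assumed model from the highway example

# Probability that the model yields a speed below zero (five sigma below the mean).
p_negative = speeds.cdf(0.0)
print(f"{p_negative:.3e}")
```

The answer is a few in ten million: tiny, which is why the approximation is usable, but not zero, which is why it is only an approximation.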

Just don't confuse the pretty mathematical model with what really goes on.

Engineers, Physicists, Statisticians, Computer Scientists, Astronomers, and all the others! Hmmph! I don't know why we allow them to use Mathematics, I really don't...

(thing)

by Noether

Mon Aug 14 2000 at 12:47:11

                                 THE
                                NORMAL
                             LAW OF ERROR
                           STANDS OUT IN THE
                         EXPERIENCE OF MANKIND
                        AS ONE OF  THE BROADEST
                       GENERALIZATIONS OF NATURAL
                     PHILOSOPHY . IT SERVES AS THE
                   GUIDING INSTRUMENT IN RESEARCHES
                IN THE PHYSICAL AND SOCIAL SCIENCES AND
               IN MEDICINE AGRICULTURE AND ENGINEERING .
          IT IS AN INDISPENSABLE TOOL FOR THE ANALYSIS AND THE
INTERPRETATION OF THE BASIC DATA OBTAINED BY OBSERVATION AND EXPERIMENT

--W.J. Youden

(thing)

by LionMan

Sat Oct 07 2000 at 4:59:09

This is the normal density curve defined by a mean of 0 and standard deviation of 1, also called the standardized normal curve. A formula for generating the density curve is:
P=(e^(-(x^2)/2))/sqrt(2*pi)
To determine the probability that a value from a normally distributed set of data is at most x, integrate this function from -infinity to (x-mu)/sigma, where mu is the mean and sigma is the standard deviation.
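The integration can be done numerically. The following is a rough sketch rather than a recommended method: it uses Simpson's rule, and it assumes that -10 is far enough out to stand in for -infinity (the area below that point is negligible). The function names, step count and cutoff are choices made for this example.

```python
import math

def standard_density(t):
    """The standardized normal curve: e^(-t^2/2) / sqrt(2*pi)."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def normal_cdf(x, mu=0.0, sigma=1.0, steps=20000, lower=-10.0):
    """Approximate P(X <= x) by integrating the standard density
    from `lower` up to (x - mu) / sigma with Simpson's rule."""
    z = (x - mu) / sigma
    if z <= lower:
        return 0.0
    h = (z - lower) / steps  # steps must be even for Simpson's rule
    total = standard_density(lower) + standard_density(z)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * standard_density(lower + i * h)
    return total * h / 3.0

print(normal_cdf(0.0), normal_cdf(60.0, mu=50.0, sigma=10.0))
```

By symmetry the first value should be one half; the second is the probability of falling at most one standard deviation above the mean.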

(thing)

by Cermain

Tue Nov 14 2000 at 21:06:42

A mainstay of statistics, this curve is symmetric around a single mode (which also happens to be the mean). It has inflection points at one standard deviation to either side of the mean. (Note: jt points out that differentiation confirms this: the second derivative of the density is zero exactly at those two points.)
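The inflection-point question can be settled by differentiating the density twice, which gives f''(x) = f(x) * ((x - a)^2 - sigma^2) / sigma^4. A small sketch that evaluates this expression; the mean of 50 and standard deviation of 10 are arbitrary values picked for the check.

```python
import math

def density(x, a=0.0, sigma=1.0):
    """Normal density with mean a and standard deviation sigma."""
    return math.exp(-(x - a) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def second_derivative(x, a=0.0, sigma=1.0):
    """Analytic second derivative: f(x) * ((x - a)^2 - sigma^2) / sigma^4."""
    return density(x, a, sigma) * ((x - a) ** 2 - sigma ** 2) / sigma ** 4

a, sigma = 50.0, 10.0  # arbitrary illustrative values
for x in (a, a + sigma, a + 2 * sigma):
    print(x, second_derivative(x, a, sigma))
```

The second derivative is negative at the mean, zero at one standard deviation out, and positive at two, so the curvature changes sign at one standard deviation and nowhere else on that side.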

The curve is described by the following equation: y = (1 / sqrt(2 * pi * sigma^2)) * e^(-(x - a)^2 / (2 * sigma^2))

...where a is the mean and sigma is the standard deviation. Also known as the Gaussian curve, the Normal curve, the bell curve, or the Laplace-Gaussian curve. Karl Pearson is apparently the person responsible for the term normal, which he coined in order to avoid a naming dispute, but which he apparently later regretted since it incorrectly implies that all other distributions of data are somehow abnormal.

What does this mean to you?

Gaussian curves appear all over the place. IQ is assumed to follow a normal curve, with 100 being the mean (average), and half of the population falling above the mean, half below. Test scores for well-defined tests often fall into this shape. A lot of science, especially social science, tends to assume that data fit this pattern and chooses which statistical tests to use based on that assumption. T-tests and ANOVAs, for example, assume that the samples come from a normally distributed population.

Most statistics books contain tables at the back which list the probability that something occurs however many standard deviations away from the mean of the curve.

(idea)

by Garrett

Sat Mar 17 2001 at 20:32:38

Statistical term for a symmetric curve in which the measures of central tendency (mean, median, and mode) are all equal. This is extremely important in statistics, as many continuous distributions approximate the normal curve. Using a table of values (or integrating with Calculus), we can find the area under the curve, and thus the probability.

Approximately 68% of all observations fall within 1 standard deviation on either side of the mean.
Approximately 95% of all observations fall within 2 standard deviations on either side of the mean.
Approximately 99.7% of all observations fall within 3 standard deviations on either side of the mean.
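These three percentages can be checked by brute force instead of a table. A sketch that draws standard normal samples with random.gauss and counts how many land within k standard deviations; the sample size and seed are arbitrary, so the results only match the rule to about two decimal places.

```python
import random

def fraction_within(k, n_samples=200000, seed=2001):
    """Fraction of simulated standard normal draws with |value| <= k."""
    rng = random.Random(seed)  # fixed seed (arbitrary) for repeatability
    hits = sum(1 for _ in range(n_samples) if abs(rng.gauss(0.0, 1.0)) <= k)
    return hits / n_samples

coverage = {k: fraction_within(k) for k in (1, 2, 3)}
for k, frac in coverage.items():
    print(f"within {k} standard deviation(s): {frac:.3f}")
```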

(idea)

by pimephalis

Fri Jun 22 2001 at 13:12:23

Not to be overly nit-picky, but blaaf's writeup is not entirely accurate. He provides the values of the standard normal distribution, which is the normal distribution when the mean is 0 and standard deviation 1. This is a particularly useful normal distribution (i.e. it is used to simplify calculations in statistical tests), but only one of an infinite set.

The equation for the normal curve is:

f(x) = e^(-(x-μ)²/(2σ²)) / √(2πσ²)

Where μ is the mean of the distribution, σ is the standard deviation of the distribution and f(x) is the probability density function. As you can easily see, if the mean is 0 and standard deviation 1, the normal curve becomes:

f(z) = e^(-z²/2) / √(2π)
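The link between the general curve and the standard one can be written out in a few lines. A sketch with made-up function names: substituting z = (x - μ)/σ into the standard curve and dividing by σ gives back the general density, which is why one table of standard values is enough for every normal distribution.

```python
import math

def normal_pdf(x, mu, sigma):
    """General normal density with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def standard_pdf(z):
    """Standard normal density (mean 0, standard deviation 1)."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def pdf_via_standard(x, mu, sigma):
    """Same density by standardising: z = (x - mu) / sigma, then dividing by sigma."""
    return standard_pdf((x - mu) / sigma) / sigma

# Illustrative values only: mean 50, standard deviation 10, evaluated at x = 60.
print(normal_pdf(60.0, 50.0, 10.0), pdf_via_standard(60.0, 50.0, 10.0))
```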
