A key focus of our class this week has been “distributions”. A distribution is simply an arrangement of the values of a variable, such as the population sizes of the states. A “probability distribution” is an arrangement of all the values (potential outcomes) of a variable that reflects how frequently those values occur in nature. A distribution can be empirical, which means that it is an actual bunch of numbers, or theoretical, in which case we are just imagining an ideal arrangement of numbers. The normal or “bell” curve is just such a theoretical distribution.
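
To make the empirical/theoretical distinction concrete, here is a minimal sketch in R. It assumes a standard normal distribution (mean 0, standard deviation 1) purely for illustration, and the variable names are hypothetical.

```r
# Empirical vs. theoretical: a small illustration (assumed example values).
set.seed(42)  # fixed seed so the sketch is repeatable

# Empirical distribution: an actual bunch of numbers drawn at random.
empirical_sample <- rnorm(1000, mean = 0, sd = 1)

# Share of the sample at or below 1, computed from the actual numbers.
empirical_share <- mean(empirical_sample <= 1)

# Theoretical counterpart: the ideal normal curve's probability of a value <= 1.
theoretical_share <- pnorm(1, mean = 0, sd = 1)

# The two should be close, but not identical, because the sample is random.
print(c(empirical = empirical_share, theoretical = theoretical_share))
```

The empirical share moves a little with every new sample, while the theoretical share never changes.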

### Step 1: Write, test, and submit the necessary code in R

R, the open source statistical system, is great at creating empirical distributions made up of randomly generated numbers. The book includes several commands and explanations of randomly generated distributions. Let's create a normal distribution using R and then write a function that provides more information about the distribution.

# 1. Generate a normal distribution of 1000 samples, with a mean of 80

# 2. Write a function that takes three arguments – a vector, a min and a max, and returns the percentage of elements in the vector that are between the min and max (including the min and max)

# 3. Use the function to see what percentage of your normal distribution samples fall in the range 79 to 81

# 4. Repeat 3 times, to see if the results vary.
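
One possible shape for the pieces above, offered as a sketch rather than a model answer. The task does not give a standard deviation, so `sd = 5` below is an assumption, and the names `scores` and `pct_between` are hypothetical.

```r
# Sketch for Step 1. Assumption: sd = 5 (the task only specifies the mean).

# 1. Draw 1000 samples from a normal distribution with mean 80.
scores <- rnorm(1000, mean = 80, sd = 5)

# 2. Percentage of elements of vec that lie in [lo, hi], endpoints included.
pct_between <- function(vec, lo, hi) {
  in_range <- vec >= lo & vec <= hi  # logical vector, TRUE where in range
  100 * mean(in_range)               # mean of TRUE/FALSE is the proportion
}

# 3. Percentage of samples between 79 and 81.
print(pct_between(scores, 79, 81))

# 4. Repeat with fresh samples; the percentage changes a little each time.
for (i in 1:3) {
  print(pct_between(rnorm(1000, mean = 80, sd = 5), 79, 81))
}
```

Taking the mean of a logical vector avoids an explicit loop or counter. A different standard deviation would change the percentage considerably, so state whichever value you choose.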


### Step 2: Write, test, and submit the necessary code in R

We can also explore what is called a “Pareto” distribution. We can use R to generate a Pareto distribution of state populations that may be quite similar to the populations of the actual U.S. states. In other words, we can generate random numbers for the sizes of the Fictional States of America (FSA). You can use ??


# 1 & 2. Generate 51 random numbers in a Pareto distribution and assign them to a variable called “FSApops.” Specify a “location” and a “shape” for your Pareto distribution that makes it as similar as possible to the actual distribution of state populations.

# 3. Create a histogram that shows the distribution of values in FSApops. Hint: hist()

# 4. Report the actual mean and the actual standard deviation of the 51 values stored in FSApops.

# 5. Report the population of your largest fictional state (i.e., your California) and your smallest fictional state (i.e., your Wyoming). 
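
The five tasks above could be sketched as follows. So that the sketch runs without any extra package, it draws Pareto values in base R by inverse-transform sampling instead of using the VGAM command. The helper `draw_pareto` is hypothetical, and the scale and shape values are untuned placeholders. In your submission you would load VGAM and use its generator (believed to be `rpareto(n, scale, shape)`; check the package help to confirm).

```r
# Sketch for Step 2. Assumptions: the scale and shape values are placeholders,
# not tuned answers, and draw_pareto() is a hypothetical base-R stand-in for
# the VGAM generator (inverse-transform sampling of a Pareto distribution).
draw_pareto <- function(n, scale, shape) {
  scale / runif(n)^(1 / shape)  # every value is at least `scale`
}

# 1 & 2. Fifty-one fictional state populations.
FSApops <- draw_pareto(51, scale = 500000, shape = 1)

# 3. Histogram of the fictional populations.
hist(FSApops, main = "Fictional States of America", xlab = "Population")

# 4. Actual mean and standard deviation of the 51 values.
print(mean(FSApops))
print(sd(FSApops))

# 5. Largest and smallest fictional states.
print(max(FSApops))
print(min(FSApops))
```

Because the tail is so long, the mean and standard deviation can swing widely from one run to the next, driven by one or two very large states.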

Hints:

• The necessary R command for generating random numbers in a Pareto distribution is located in a “package” called “VGAM”. The code you submit should include the two necessary commands for making this happen (they should be the first two commands in your code). In the VGAM package, you will find a command for generating random numbers that fit a Pareto distribution.

• You will have to look up the meaning of the location and shape parameters so that you can figure out how to set them to make your Fictional States of America as similar to the real states as possible.
    • The scale/location parameter sets the position of the “left edge” of the probability density. The only outcomes that can be observed are greater than the value of the scale/location parameter. Your scale/location has to be > 0.
    • The shape parameter determines the steepness of the “ski slope.”
    • You will determine your shape and scale based on the US state population data we have been using. One way to find a good distribution is to vary the parameters: change (or “play” with) the location and shape to see how they influence the distribution, then pick the values that make it look as close to our state populations as you can.

• Note that the random numbers will differ substantially every time you run the command, so we don’t expect your data to be a perfect match. You do want your smallest state to be about the size of Wyoming and about 15 of your states to be under 2 million in population.

• A Pareto distribution looks like a ski slope: high on the left, with a long tail running downhill to the right (see pic below).
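
The advice to vary the parameters can be made a little more systematic. For a Pareto distribution with a given scale and shape, the probability of a value below x is 1 - (scale / x)^shape, so the expected number of states under 2 million can be computed for any candidate shape. The sketch below assumes a placeholder scale (`scale_guess`), which you would replace with the smallest state population from the class data; the function name `expected_under` is hypothetical.

```r
# Sketch: how the shape changes the expected number of small states.
# Assumption: scale_guess is a placeholder; replace it with the smallest
# state population from the class data.
scale_guess <- 500000
cutoff <- 2000000
n_states <- 51

# Pareto CDF: P(X < x) = 1 - (scale / x)^shape for x >= scale.
expected_under <- function(shape, scale, x, n) {
  n * (1 - (scale / x)^shape)
}

# Try a few shapes and see which lands near 15 states under the cutoff.
shapes <- c(0.1, 0.25, 0.5, 1, 2)
print(data.frame(
  shape = shapes,
  expected_under_2M = expected_under(shapes, scale_guess, cutoff, n_states)
))

# Solving 15 = n * (1 - (scale / cutoff)^shape) for the shape directly:
shape_for_15 <- log(1 - 15 / n_states) / log(scale_guess / cutoff)
print(shape_for_15)
```

This gives only an expected count. Any single draw of 51 values will land above or below it, so treat the computed shape as a starting point and then compare histograms by eye.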

### Learning Goals for this activity:

A. Generate random numbers in a Pareto distribution and assign a variable name.
B. Specify a “location” and a “shape” for a distribution to conform to a model.
C. Create a histogram depicting a distribution of values.
D. Use R commands to report mean and standard deviation.
E. Use the appropriate R command to report the most extreme values of a variable.
