## Discrete and Continuous Random Variables

In statistics, a random variable is a variable whose values represent outcomes of a random phenomenon. We usually write random variables in capital letters. For example, let X be the random variable that represents the outcome of a die roll. The possible values of X are 1, 2, 3, 4, 5, and 6.

### Discrete random variables

Discrete random variables are those whose possible values are isolated from one another, so that they can be listed one by one (even if the list is infinite). Such a variable can take on only certain definite values. Examples of discrete random variables include the number of heads in two tosses of a coin, the number of children in a family, and the number of defective items in a lot. The function that assigns a probability to each possible value of a discrete random variable is called a probability mass function (PMF).

A probability mass function must satisfy two conditions:

- Every probability mass function must be nonnegative over the entire sample space. This condition ensures that we are dealing with probabilities and not negative values or something else entirely.
- The sum of all the probabilities in the sample space must equal 1. This condition is known as the normalization condition and ensures that there is a 100% chance that some outcome in the sample space occurs. If the probabilities summed to less than 1, some probability would be unaccounted for; if they summed to more than 1, the assignment would be inconsistent.
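As a small illustration of these two conditions, the sketch below builds the PMF for the number of heads in two tosses of a coin and checks it. The coin is assumed to be fair, and the helper name `is_valid_pmf` is made up for this example.

```python
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of two fair coin tosses.
outcomes = list(product("HT", repeat=2))

# PMF of X = number of heads: map each value of X to its probability.
pmf = {}
for outcome in outcomes:
    heads = outcome.count("H")
    pmf[heads] = pmf.get(heads, Fraction(0)) + Fraction(1, len(outcomes))

def is_valid_pmf(pmf):
    """Check the two PMF conditions: nonnegativity and normalization."""
    nonnegative = all(p >= 0 for p in pmf.values())
    normalized = sum(pmf.values()) == 1
    return nonnegative and normalized

for value in sorted(pmf):
    print(value, pmf[value])
print("valid PMF:", is_valid_pmf(pmf))
```

Exact fractions are used instead of floats so that the normalization check is not affected by rounding.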

### Continuous random variables

A random variable is continuous if it can take on any value within a certain range. For example, the weight of a newborn baby is a continuous random variable because it can assume any value between, say, 2.5 kg and 4 kg. There are uncountably many possible values, and the probability of any single exact value is zero, so probabilities are assigned to intervals of values instead.

Continuous random variables are often grouped by whether or not they follow the normal distribution. The normal distribution is also known as the Gaussian distribution and is represented by the symmetric bell curve. Continuous random variables that don’t follow the normal distribution are referred to as non-normal. A non-normal distribution may be skewed, as with the exponential distribution, or symmetric, as with the uniform distribution.

## Probability Distributions

Probability distributions are important in statistics and machine learning because they describe how the values of a random variable, and therefore our data, are spread out. For example, if we are trying to predict the probability of an event, we can use a distribution to see how likely it is.

### Discrete probability distributions

A discrete probability distribution is a list of the possible values that a random variable can take on, along with the associated probabilities. If a random variable can take on only a finite or countably infinite number of values, then it has a discrete probability distribution.

The probabilities associated with the values in a discrete probability distribution must meet the following criteria:

- They must all be greater than or equal to 0.
- They must all be less than or equal to 1.
- They must sum to 1.

### Continuous probability distributions

In general, a probability distribution is a mathematical function that gives the probabilities of the different possible outcomes of an experiment. The function is defined over the sample space of possible outcomes. A random variable is a variable whose value is not known in advance but which follows some probability distribution. In contrast, a non-random variable has a known, fixed value.

A continuous probability distribution is a probability distribution over a continuum of possible outcomes, of which there are uncountably many. Because each individual outcome has probability zero, the probabilities cannot simply be added up; instead, the distribution is described by a density whose total area (its integral over all possible values) must equal 1. Continuous distributions are often used to model physical measurements, such as the height of people, where any value in a range could in principle be observed.

There are many types of continuous distributions, but the most common are the normal (or Gaussian) distribution, the exponential distribution, and the uniform distribution. The normal distribution is often used to model quantities that arise as the sum of many small independent effects, such as IQ scores or test scores. The continuous uniform distribution is used when every value in an interval is equally likely, such as a point chosen at random on a line segment. (Rolling a die or flipping a coin also gives equally likely outcomes, but those follow a discrete uniform distribution.) The exponential distribution models the waiting time between events that occur at a constant average rate, such as the time until a radioactive atom decays or the time between arrivals at a bus stop.
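To get a feel for these three distributions, one can draw random values from each and compare the sample means with the theoretical ones. The sketch below uses Python's standard `random` module; the parameters (mean 100 and standard deviation 15 for the normal, the interval 0 to 1 for the uniform, and rate 2 for the exponential) are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so that runs are repeatable
n = 100_000

# Normal with mean 100 and standard deviation 15 (an IQ-like scale).
normal_draws = [rng.gauss(100, 15) for _ in range(n)]

# Continuous uniform on [0, 1]: every value in the interval is equally likely.
uniform_draws = [rng.uniform(0, 1) for _ in range(n)]

# Exponential with rate 2 events per unit time, so the mean waiting time is 1/2.
exponential_draws = [rng.expovariate(2.0) for _ in range(n)]

print("normal mean      (theory 100):", statistics.fmean(normal_draws))
print("uniform mean     (theory 0.5):", statistics.fmean(uniform_draws))
print("exponential mean (theory 0.5):", statistics.fmean(exponential_draws))
```

With this many draws, each sample mean should land close to its theoretical value, although the exact digits depend on the seed.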

The probabilities associated with continuous random variables can be determined using calculus. Probability density functions (PDFs) describe the relative likelihoods of different values, and the probability that the variable falls in an interval is the area under the PDF over that interval. For example, if we wanted to know the probability that someone chosen at random from a population would have a height between 5 feet and 6 feet, we could integrate a normal PDF from 5 to 6 to calculate this probability.
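The height example can be sketched in a few lines. The population mean (5.5 feet) and standard deviation (0.3 feet) below are assumed purely for illustration, and `integrate_pdf` is a hypothetical helper written for this example. The probability is computed twice: once from the cumulative distribution function, and once by numerically adding up the area under the PDF.

```python
from statistics import NormalDist

# Hypothetical population: heights ~ Normal(mean 5.5 ft, sd 0.3 ft).
heights = NormalDist(mu=5.5, sigma=0.3)

# P(5 <= height <= 6) from the cumulative distribution function (CDF).
p_cdf = heights.cdf(6.0) - heights.cdf(5.0)

def integrate_pdf(dist, a, b, steps=10_000):
    """Approximate the area under dist.pdf on [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width

# The same probability as an area under the density curve.
p_area = integrate_pdf(heights, 5.0, 6.0)

print(round(p_cdf, 4), round(p_area, 4))
```

The two numbers should agree to several decimal places, which illustrates that the CDF is simply the accumulated area under the PDF.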

## Sampling

Sampling is the process of selecting a representative group from a population in order to learn about that population. A sample is a subset of the population. Population parameters are estimated from samples. Sampling can be done in a variety of ways.

### Sampling from a population

In statistics, sampling from a population is the process of determining which individuals to include in a study. This can be done in a number of ways, but the two most common are probability sampling and convenience sampling.

Probability (or random) sampling is the process of selecting a sample from a population in such a way that every individual has a known, nonzero chance of being selected. In its simplest form, simple random sampling, each individual has an equal chance. This is usually done by assigning a number to each individual in the population and then using a random number generator to decide which numbers are included.

Convenience sampling, on the other hand, is the process of selecting individuals for a study based on their accessibility or convenience. For example, researchers might choose to study students at their university because they are easy to access. However, this method can lead to bias because those who are easy to access may not be representative of the population as a whole.

It’s important to choose the right type of sampling method for your study in order to ensure that your results are accurate and representative of the population.

### Sampling from a distribution

In statistics, sampling from a distribution means drawing observations whose long-run frequencies follow that distribution. When the distribution in question is that of a finite population, this amounts to selecting a subset of individuals from the population to study. The individuals are chosen according to a sampling design fixed in advance, and the resulting sample is then used to represent the entire population from which it was drawn.

There are several different types of sampling that can be used, and the type that is used will depend on the specific research question that is being addressed. Some of the most common types of sampling include:

- Random sampling: This is the most basic type of sampling, and it involves selecting a subset of individuals from the population at random. This type of sampling is often used when there is no specific criterion that needs to be met in order for an individual to be included in the sample.
- Stratified sampling: This type of sampling involves dividing the population into distinct strata, or groups, and then selecting a random sample from each stratum. This approach is often used when there are subgroups within the population that need to be represented in the sample.
- Cluster sampling: This type of sampling involves dividing the population into distinct groups, or clusters, and then selecting a random sample of clusters. Within each selected cluster, all individuals are included in the sample. This approach is often used when it is difficult or expensive to obtain a complete list of all individuals in the population.
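The three designs above can be sketched with Python's standard `random` module. The population of 12 students spread over three schools is invented for this example, and the function names are hypothetical; the `school` field serves as the stratum in one design and as the cluster in another.

```python
import random

rng = random.Random(42)  # fixed seed so that runs are repeatable

# Hypothetical population: 12 students, 4 in each of schools A, B, and C.
population = [{"id": i, "school": "ABC"[i % 3]} for i in range(12)]

def simple_random_sample(pop, k):
    """Pick k individuals at random, without replacement."""
    return rng.sample(pop, k)

def stratified_sample(pop, key, k_per_stratum):
    """Split pop into strata by `key`, then sample k individuals from each."""
    strata = {}
    for unit in pop:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample

def cluster_sample(pop, key, n_clusters):
    """Pick whole clusters at random and keep everyone inside them."""
    clusters = sorted({unit[key] for unit in pop})
    chosen = set(rng.sample(clusters, n_clusters))
    return [unit for unit in pop if unit[key] in chosen]

print(simple_random_sample(population, 4))
print(stratified_sample(population, "school", 2))
print(cluster_sample(population, "school", 1))
```

Note the difference in what is randomized: stratified sampling draws individuals from every group, while cluster sampling draws groups and then keeps every individual in the chosen ones.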

## Estimation

Estimation is the process of finding an approximate value of some quantity by using information that is available. In statistics, estimation refers to the process of using sample data to calculate the value of a population parameter. The quantity that is being estimated is called the estimand, and the statistics that are used to estimate the estimand are called estimators.

### Point estimation

In statistics, point estimation is the process of estimating a population parameter by using a single value computed from a sample. For example, if you wanted to estimate the mean weight of all players in the National Football League, you could take a random sample of 10 players and compute their average weight. This average would be your point estimate of the mean weight of all NFL players.

There are two types of point estimators: unbiased and biased. An unbiased estimator has zero bias, meaning that its expected value equals the population parameter: averaged over many repeated samples, it neither overestimates nor underestimates. A biased estimator is systematically off, meaning that its expected value differs from the population parameter, so on average it either overestimates or underestimates.

There are many different point estimators that can be used to estimate a population parameter, and different estimators have different properties (e.g., bias, efficiency). There is no single “best” estimator; the right choice depends on the particular situation and on which properties matter most.
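A standard illustration of bias is the estimation of a population variance: dividing the sum of squared deviations by the sample size n gives a biased estimator, while dividing by n - 1 gives an unbiased one. The simulation below is a sketch under assumed settings (a normal population with standard deviation 2, samples of size 5, and 10,000 repetitions), all chosen only for illustration.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so that runs are repeatable

true_variance = 4.0     # population is Normal(0, 2), so variance = 2 ** 2
n, trials = 5, 10_000

biased, unbiased = [], []
for _ in range(trials):
    sample = [rng.gauss(0, 2) for _ in range(n)]
    biased.append(statistics.pvariance(sample))   # divides by n
    unbiased.append(statistics.variance(sample))  # divides by n - 1

# Averaging each estimator over many samples approximates its expected value.
mean_biased = statistics.fmean(biased)
mean_unbiased = statistics.fmean(unbiased)

print("true variance:           ", true_variance)
print("average of /n estimates: ", mean_biased)
print("average of /(n-1) ones:  ", mean_unbiased)
```

The divide-by-n estimator should average out near (n - 1)/n times the true variance, which is 3.2 here, while the divide-by-(n - 1) estimator should average out near 4.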

### Interval estimation

An interval estimate is a range of values derived from a sample that is used to estimate a population parameter. It is expressed with a lower and an upper limit, and it is constructed so that the procedure captures the population parameter with some prescribed degree of confidence. For example, at a 95% confidence level, one chooses the limits so that, across repeated samples, about 95% of the intervals built this way would contain the population mean and only about 5% would miss it.

There are two types of interval estimates: confidence intervals and prediction intervals. A confidence interval estimates a population parameter, such as a population mean or proportion, whereas a prediction interval predicts the value of a future observation based on past observations.

Confidence intervals are usually constructed at the 95% or 99% confidence level, meaning that the procedure used to build them captures the population parameter in 95% or 99% of repeated samples. However, other levels of confidence can be used (e.g., 90%). For the same data, a higher confidence level gives a wider interval, and the wider the confidence interval, the less precise it is.

For example, if we were interested in estimating the mean length of time it takes students to complete their degrees at State University, we could take a random sample of students and ask them how long it took them to complete their degrees. We could then use those data to construct a 95% confidence interval for the mean completion time. This confidence interval would tell us that we are 95% confident that the true mean length of time it takes students to complete their degrees at State University lies somewhere between a lower limit L and an upper limit U.
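A confidence interval for a mean can be sketched as follows. The completion times are made-up data, the function name `mean_confidence_interval` is hypothetical, and the interval uses the normal approximation (sample mean plus or minus a z multiple of the standard error). For a sample this small, a t-based multiplier would be somewhat wider and is usually preferred; the normal version is used here only because it needs nothing beyond the standard library.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

def mean_confidence_interval(data, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(data)
    mean = fmean(data)
    standard_error = stdev(data) / sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error

# Hypothetical sample: years taken to complete a degree.
years = [4.0, 4.5, 5.0, 4.0, 6.0, 4.5, 5.5, 4.0, 5.0, 4.5,
         3.5, 5.0, 4.0, 4.5, 6.5, 4.0, 5.0, 4.5, 4.0, 5.5]

lower95, upper95 = mean_confidence_interval(years, 0.95)
lower99, upper99 = mean_confidence_interval(years, 0.99)
print("95% CI:", (round(lower95, 2), round(upper95, 2)))
print("99% CI:", (round(lower99, 2), round(upper99, 2)))
```

Running both levels on the same data shows the trade-off directly: the 99% interval is wider than the 95% one.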

Prediction intervals are similar to confidence intervals in that both are ranges computed from sample data. However, a prediction interval is used to predict a future observation rather than to estimate a population parameter. Because it must also account for the variation of individual observations, it is wider than a confidence interval computed from the same data. For example, suppose we wished to predict how long it will take future students to complete their degrees at State University based on data from past students. We could take a random sample of past students and ask them how long it took them to complete their degrees at State University. We could then use those data to construct a prediction interval for a future student. This prediction interval would tell us that we are 95% confident that a given future student at State University will take between a lower limit and an upper limit of the prediction interval, measured in years, to complete their degree.
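A prediction interval for one future observation can be sketched in the same spirit. The data below are made up, the function name `prediction_interval` is hypothetical, and the formula assumes that completion times are roughly normally distributed and uses a z multiplier instead of the t multiplier that would be more appropriate for small samples.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

def prediction_interval(data, confidence=0.95):
    """Approximate prediction interval for a single future observation.

    Assumes roughly normal data. The sqrt(1 + 1/n) factor combines the
    spread of individual observations with the uncertainty in the mean.
    """
    n = len(data)
    mean = fmean(data)
    spread = stdev(data) * sqrt(1 + 1 / n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * spread, mean + z * spread

# Hypothetical sample: years past students took to complete a degree.
past_years = [4.0, 4.5, 5.0, 4.0, 6.0, 4.5, 5.5, 4.0, 5.0, 4.5,
              3.5, 5.0, 4.0, 4.5, 6.5, 4.0, 5.0, 4.5, 4.0, 5.5]

pi_lower, pi_upper = prediction_interval(past_years, 0.95)
print("95% prediction interval:", (round(pi_lower, 2), round(pi_upper, 2)))
```

Unlike the standard error of the mean, the spread term here does not shrink toward zero as the sample grows, so a prediction interval stays wide even with a lot of data.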