Statistics is the study of the collection, organization, analysis, interpretation and presentation of data using mathematics.
It deals with all aspects of data including the planning of data collection in terms of the design of surveys and experiments.
Data can be analyzed using one or both of two statistical methodologies: descriptive statistics and inferential statistics.
In applying statistics to a scientific, industrial, or societal problem, it is necessary to begin with a population or process to be studied. Populations can be diverse topics such as "all persons living in a country" or "every atom composing a crystal". A population can also be composed of observations of a process at various times, with the data from each observation serving as a different member of the overall group. Data collected about this kind of "population" constitutes what is called a time series.
For practical reasons, a chosen subset of the population, called a sample, is studied instead of compiling data about the entire group (an operation called a census). Once a sample that is representative of the population has been determined, data is collected for the sample members in an observational or experimental setting.
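The difference between a census and a sample can be illustrated with a short Python sketch. The population here is simulated (10,000 made-up heights), and the sample size of 200, the seed, and all variable names are assumptions chosen purely for illustration.

```python
import random
import statistics

# Hypothetical population: 10,000 simulated heights in centimetres (made-up data).
rng = random.Random(42)
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Census: compute the mean over every member of the population.
census_mean = statistics.fmean(population)

# Sample: draw 200 members at random, without replacement,
# and compute the same quantity on that subset only.
sample = rng.sample(population, k=200)
sample_mean = statistics.fmean(sample)

print(f"census mean: {census_mean:.2f}")
print(f"sample mean: {sample_mean:.2f}")
```

A simple random sample such as this one is only one way of obtaining a representative sample; the sample mean will generally be close to, but not equal to, the census mean.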
This data can then be subjected to statistical analysis, serving two related purposes: description and inference.
Descriptive statistics summarize the population data by describing what was observed in the sample numerically or graphically. Numerical descriptors include:
- mean and standard deviation for continuous data types (like heights or weights),
- frequency and percentage, which are more useful for describing categorical data.
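The numerical descriptors listed above can be computed with Python's standard library. The following is a minimal sketch; the height and eye-colour values are invented for the example and the variable names are hypothetical.

```python
import statistics
from collections import Counter

# Made-up sample data for illustration only.
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]          # continuous
eye_colors = ["brown", "blue", "brown", "green", "brown"]  # categorical

# Continuous data: mean and (sample) standard deviation.
mean_height = statistics.mean(heights_cm)
sd_height = statistics.stdev(heights_cm)

# Categorical data: frequency of each category and its percentage share.
counts = Counter(eye_colors)
percentages = {color: 100 * n / len(eye_colors) for color, n in counts.items()}

print(f"mean = {mean_height:.1f} cm, sd = {sd_height:.2f} cm")
for color, n in counts.items():
    print(f"{color}: {n} ({percentages[color]:.0f}%)")
```

Note that `statistics.stdev` computes the sample standard deviation (dividing by n - 1); `statistics.pstdev` would be the choice if the data were the whole population.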
Inferential statistics uses patterns in the sample data to draw inferences about the population represented, accounting for randomness. These inferences may take the form of:
- answering yes/no questions about the data (hypothesis testing),
- estimating numerical characteristics of the data (estimation),
- describing associations within the data (correlation) and modeling relationships within the data (for example, using regression analysis).
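The forms of inference listed above can each be sketched in a few lines of Python. The data below are invented, and the specific techniques are assumptions chosen for brevity rather than the only options: a normal-approximation confidence interval for estimation, an exact permutation test for hypothesis testing, and ordinary least squares for correlation and regression.

```python
import math
import statistics
from itertools import combinations

# Made-up measurements for two groups (illustration only).
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.1, 4.4, 3.9, 4.6, 4.2]

# Estimation: sample mean with an approximate 95% confidence interval.
# The normal approximation is rough for a sample this small.
mean_a = statistics.fmean(group_a)
z = statistics.NormalDist().inv_cdf(0.975)
half_width = z * statistics.stdev(group_a) / math.sqrt(len(group_a))
ci_low, ci_high = mean_a - half_width, mean_a + half_width

# Hypothesis testing: exact two-sided permutation test of
# "both groups have the same mean".
pooled = group_a + group_b
observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
extreme = total = 0
for idx in combinations(range(len(pooled)), len(group_a)):
    chosen = set(idx)
    a = [pooled[i] for i in idx]
    b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    if abs(statistics.fmean(a) - statistics.fmean(b)) >= observed - 1e-12:
        extreme += 1
    total += 1
p_value = extreme / total

# Correlation and regression: Pearson's r and a least-squares line y = slope*x + intercept.
def least_squares(xs, ys):
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r = least_squares(x, y)

print(f"mean of A = {mean_a:.2f}, approx. 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print(f"permutation p-value = {p_value:.4f}")
print(f"r = {r:.3f}, fitted line: y = {slope:.2f}x + {intercept:.2f}")
```

A small p-value here indicates that a difference in means at least as large as the observed one would rarely arise if the group labels were assigned at random.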
Inference can extend to forecasting, prediction and estimation of unobserved values either in or associated with the population being studied; it can include extrapolation and interpolation of time series or spatial data, and can also include data mining.
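Interpolation and extrapolation of a time series can be illustrated with a straight line through two observed points. This is a deliberately simple sketch: the series is made up, and `linear_between` is a hypothetical helper, not a standard forecasting method.

```python
# Made-up time series with a missing observation at t = 3.
times = [0, 1, 2, 4, 5]
values = [10.0, 12.0, 14.0, 18.0, 20.0]

def linear_between(p0, p1, t):
    """Value at time t on the straight line through points p0 and p1."""
    (t0, v0), (t1, v1) = p0, p1
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

points = list(zip(times, values))

# Interpolation: estimate the unobserved value at t = 3,
# which lies between the observations at t = 2 and t = 4.
interpolated = linear_between(points[2], points[3], 3)

# Extrapolation: extend the line through the last two observations to t = 6,
# which lies beyond the observed range.
extrapolated = linear_between(points[-2], points[-1], 6)

print(interpolated)   # 16.0
print(extrapolated)   # 22.0
```

Extrapolated values rest on the assumption that the observed trend continues outside the sampled range, so they are generally less reliable than interpolated ones.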