A scatter plot (also called a scatterplot, scatter graph, scatter chart, scattergram, or scatter diagram) is a type of plot or mathematical diagram that uses Cartesian coordinates to display values for typically two variables for a set of data. If the points are color-coded, one additional variable can be displayed. The data are displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis.
Overview
A scatter plot can be used either when one continuous variable is under the control of the experimenter and the other depends on it, or when both continuous variables are independent. If a parameter exists that is systematically incremented and/or decremented by the experimenter, it is called the control parameter or independent variable and is customarily plotted along the horizontal axis. The measured or dependent variable is customarily plotted along the vertical axis. If no dependent variable exists, either type of variable can be plotted on either axis, and a scatter plot will illustrate only the degree of correlation (not causation) between the two variables.
A scatter plot can suggest various kinds of correlation between variables with a certain confidence interval. For example, for weight and height, weight would be on the y-axis and height on the x-axis. Correlations may be positive (rising), negative (falling), or null (uncorrelated). If the pattern of dots slopes from lower left to upper right, it indicates a positive correlation between the variables being studied. If the pattern of dots slopes from upper left to lower right, it indicates a negative correlation. A line of best fit (also called a 'trendline') can be drawn to study the relationship between the variables. An equation for the correlation between the variables can be determined by established best-fit procedures. For a linear correlation, the best-fit procedure is known as linear regression and is guaranteed to generate a correct solution in a finite time. No universal best-fit procedure is guaranteed to generate a correct solution for arbitrary relationships. A scatter plot is also very useful for seeing how two comparable data sets agree and for showing nonlinear relationships between variables. The ability to do this can be enhanced by adding a smooth line such as LOESS. Furthermore, if the data are represented by a mixture model of simple relationships, these relationships will be visually evident as superimposed patterns.
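As a rough illustration of the correlation and best-fit line described above, the following sketch computes a Pearson correlation coefficient and an ordinary least-squares line using only the Python standard library. The function names and the height/weight values are invented for illustration and do not come from any real study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative data: height (cm) on the x-axis, weight (kg) on the y-axis.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 77, 84]

r = pearson_r(heights, weights)
slope, intercept = least_squares_line(heights, weights)
print(f"r = {r:.3f}; weight ~ {slope:.3f} * height + {intercept:.3f}")
```

A value of r close to +1 corresponds to dots sloping from lower left to upper right, a value close to -1 to dots sloping from upper left to lower right, and a value near 0 to no linear correlation.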
The scatter diagram is one of the seven basic tools of quality control.
Scatter charts can be built in the form of bubble, marker, and/or line charts.
Example
For example, to display a link between a person's lung capacity and how long that person can hold their breath, a researcher would choose a group of people to study, then measure each one's lung capacity (first variable) and how long that person can hold their breath (second variable). The researcher would then plot the data in a scatter plot, assigning "lung capacity" to the horizontal axis and "time holding breath" to the vertical axis.
A person with a lung capacity of 400 who held their breath for 21.7 seconds would be represented by a single dot on the scatter plot at the point (400, 21.7) in Cartesian coordinates. The scatter plot of all the people in the study would enable the researcher to obtain a visual comparison of the two variables in the data set and would help determine what kind of relationship there might be between the two variables.
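The mapping from study records to dots can be sketched with a small text-based renderer. This is only an illustration: the function `ascii_scatter` is hypothetical, and apart from the (400, 21.7) pair mentioned above, the records are invented.

```python
def ascii_scatter(points, width=40, height=10):
    """Render (x, y) points as a text grid; x grows rightwards, y grows upwards."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        # Scale each coordinate to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1)) if x_max > x_min else 0
        row = round((y - y_min) / (y_max - y_min) * (height - 1)) if y_max > y_min else 0
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top line of the plot
    return "\n".join("".join(line) for line in grid)


# Hypothetical records: (lung capacity, time holding breath in seconds).
# Only (400, 21.7) appears in the text; the other pairs are invented.
subjects = [(350, 17.9), (400, 21.7), (450, 24.0), (500, 27.3), (550, 29.8)]
plot = ascii_scatter(subjects)
print(plot)
```

Each person contributes exactly one dot, with lung capacity setting the horizontal position and breath-holding time the vertical position. A real analysis would normally use a plotting library rather than a character grid.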
Scatterplot matrix
For a set of data variables (dimensions) X_1, X_2, ..., X_k, the scatter plot matrix shows all the pairwise scatter plots of the variables in a single view, with multiple scatter plots arranged in a matrix format. For k variables, the scatter plot matrix will contain k rows and k columns. The plot located at the intersection of the i-th row and the j-th column is a plot of variable X_i versus X_j. This means that each row and each column corresponds to one dimension, and each cell plots a scatter plot of two dimensions.
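The layout of such a matrix can be sketched as a nested list in which each cell holds the points for one pair of variables. The sketch below assumes that "X_i versus X_j" means X_i on the vertical axis and X_j on the horizontal axis; the function name and the data are hypothetical.

```python
def scatterplot_matrix(columns):
    """Return a k x k grid; cell [i][j] holds the points of X_i versus X_j.

    Assumption: each point is (value of X_j, value of X_i), so X_j is on the
    horizontal axis and X_i is on the vertical axis.
    """
    names = list(columns)
    return [
        [list(zip(columns[col_name], columns[row_name])) for col_name in names]
        for row_name in names
    ]


# Invented data with k = 3 variables measured on four observations.
data = {
    "X1": [1.0, 2.0, 3.0, 4.0],
    "X2": [2.1, 3.9, 6.2, 8.0],
    "X3": [9.0, 7.5, 4.0, 1.5],
}
grid = scatterplot_matrix(data)
print(len(grid), "rows x", len(grid[0]), "columns")
print("cell (row 1, column 2):", grid[0][1])
```

The cell at row j, column i contains the same pairs as the cell at row i, column j with the axes swapped, and each diagonal cell plots a variable against itself.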
A generalized scatter plot matrix offers a range of displays for paired combinations of categorical and quantitative variables. A mosaic plot, fluctuation diagram, or faceted bar chart may be used to display two categorical variables. Other plots are used for one categorical and one quantitative variable.
Source of the article : Wikipedia