Interactive Statlets - Data Exploration
| Statlet |
| Bivariate Density Estimator |
| Demographic Map Brushing |
| Frequency Histogram |
| Ridgeline Plots |
| Sunflower Plot |
| Trellis Plots |
| Trivariate Density Estimator |
| Violin Plots |
The Interactive Histogram Statlet creates a frequency histogram for a column of numeric data. The controls on the toolbar make it easy to change the definition of the classes into which the data are grouped. The density function of a normal distribution with the same mean and standard deviation as the data may be superimposed on the histogram. In addition, a nonparametric density trace may be drawn.
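The three elements described above (class counts, a fitted normal curve, and a density trace) can be sketched in a few lines of Python. This is only an illustration, not the Statlet's own algorithm: the function names, the sample data, the Gaussian kernel, and the bandwidth value are all assumptions chosen for the example.

```python
import math
import statistics

def histogram_counts(data, lower, upper, n_classes):
    """Count observations in n_classes equal-width classes on [lower, upper]."""
    width = (upper - lower) / n_classes
    counts = [0] * n_classes
    for x in data:
        if lower <= x <= upper:
            # An observation equal to the upper limit goes into the last class.
            k = min(int((x - lower) / width), n_classes - 1)
            counts[k] += 1
    return counts

def normal_density(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def density_trace(x, data, bandwidth):
    """Nonparametric density estimate at x (Gaussian kernel assumed here)."""
    return sum(normal_density(x, xi, bandwidth) for xi in data) / len(data)

# Made-up sample, for illustration only.
data = [4.1, 4.8, 5.0, 5.2, 5.5, 5.9, 6.3, 6.4, 7.0, 7.8]

# Changing the class definition regroups the same data.
counts_4 = histogram_counts(data, lower=4.0, upper=8.0, n_classes=4)
counts_2 = histogram_counts(data, lower=4.0, upper=8.0, n_classes=2)

# Normal curve with the same mean and standard deviation as the data,
# scaled by n * class width so it is comparable to the frequency bars.
mean, sd = statistics.mean(data), statistics.stdev(data)
curve_height = len(data) * 1.0 * normal_density(5.5, mean, sd)

trace_height = density_trace(5.5, data, bandwidth=0.5)
```

Regrouping the same ten values into 4 or 2 classes changes only the counts, which is the kind of adjustment the toolbar controls make interactive.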
The Bivariate Density Statlet creates a frequency histogram for 2 columns of numeric data. It is used to visualize the joint distribution of 2 random variables. The controls on the toolbar make it easy to change the definition of the classes into which the data are grouped. The density function of a bivariate normal distribution with the same means, standard deviations, and covariance as the data may be displayed in place of the histogram. In addition, a nonparametric density estimate may be drawn.
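As a rough sketch of the bivariate normal option, the density can be evaluated from the sample means, standard deviations, and correlation (the covariance divided by the product of the standard deviations). The function names and the data below are hypothetical and only illustrate the formula; they are not taken from Statgraphics.

```python
import math
import statistics

def sample_moments(xs, ys):
    """Return means, standard deviations, and correlation of two columns."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return mx, my, sx, sy, cov / (sx * sy)

def bivariate_normal_density(x, y, mx, my, sx, sy, rho):
    """Bivariate normal density at (x, y); requires |rho| < 1."""
    zx = (x - mx) / sx
    zy = (y - my) / sy
    q = (zx * zx - 2 * rho * zx * zy + zy * zy) / (1 - rho * rho)
    return math.exp(-q / 2) / (2 * math.pi * sx * sy * math.sqrt(1 - rho * rho))

# Made-up columns, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

mx, my, sx, sy, rho = sample_moments(xs, ys)
peak = bivariate_normal_density(mx, my, mx, my, sx, sy, rho)
```

The fitted surface peaks at the pair of sample means and falls off along ellipses whose tilt depends on the correlation.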
The Trivariate Density Statlet displays the estimated density function for 3 columns of numeric data. It does so using either a 3-dimensional contour plot or a 3-dimensional mesh plot. The joint distribution of the 3 variables may either be assumed to be multivariate normal or be estimated using a nonparametric approach.
The Demographic Map Brushing Statlet plots a demographic map in which each region is colored either blue or red to illustrate the value of a selected variable. Using the Statlet controls, the analyst may change the cutoff that divides red from blue. Interactively changing the cutoff helps in visualizing the distribution of the specified variable throughout the map.
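The coloring rule behind this kind of brushing can be sketched as a simple threshold. The example below assumes that values above the cutoff are drawn in red and the rest in blue; the source does not say which side gets which color, and the region names and values are made up.

```python
def color_regions(values, cutoff):
    """Map each region to 'red' or 'blue' depending on its value.

    Assumption: values strictly above the cutoff are red, all others blue.
    """
    return {region: ("red" if value > cutoff else "blue")
            for region, value in values.items()}

# Hypothetical regions and values of the selected variable.
values = {"Region A": 12.5, "Region B": 30.1, "Region C": 21.7}

# Moving the cutoff recolors the map and shows where the variable is high.
low_cutoff = color_regions(values, cutoff=15.0)
high_cutoff = color_regions(values, cutoff=25.0)
```

Stepping the cutoff through a range of values and watching which regions change color is what reveals the spatial distribution of the variable.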
The Sunflower Plot Statlet is used to display an X-Y scatterplot when the number of observations is large. To avoid the problem of overplotting point symbols with large amounts of data, glyphs in the shape of sunflowers are used to display the number of observations in small regions of the X-Y space.
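The counting step behind a sunflower plot can be illustrated as follows. This sketch assumes square cells of a fixed size and one petal per observation; the actual Statlet may define its regions and glyphs differently, and the function name and points are hypothetical.

```python
import math
from collections import Counter

def sunflower_counts(points, cell_size):
    """Count the observations falling in each square cell of the X-Y plane.

    Each key is the (column, row) index of a cell; each value is the number
    of points in that cell, which would set the number of petals drawn.
    """
    counts = Counter()
    for x, y in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return counts

# Made-up points, for illustration only.
points = [(0.1, 0.2), (0.3, 0.4), (0.9, 0.9), (1.5, 0.2), (1.6, 0.1), (2.5, 2.5)]
petals = sunflower_counts(points, cell_size=1.0)
```

Six overlapping point symbols are replaced here by three glyphs, one per occupied cell.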
Ridgeline Plots display the distribution of a numeric variable across multiple levels of a categorical factor. They may include histograms, fitted distributions, and nonparametric density estimates. Both 2D and 3D versions are available.
The Violin Plot Statlet displays data using a combination of a box-and-whisker plot and a nonparametric density estimator. It is very useful for visualizing the shape of the probability density function for the population from which the data came. Violin plots may be constructed for a single sample or for multiple samples.
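The two ingredients of a violin plot can be sketched separately: the five-number summary that drives the box-and-whisker plot, and a density estimate whose value at each point sets the half-width of the violin. The quantile method, the Gaussian kernel, the bandwidth, and the sample below are assumptions made for this example, not details of the Statlet.

```python
import math
import statistics

def box_summary(sample):
    """Five-number summary used for the box-and-whisker part."""
    q1, median, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    return {"min": min(sample), "q1": q1, "median": median,
            "q3": q3, "max": max(sample)}

def violin_half_widths(sample, grid, bandwidth):
    """Gaussian kernel density at each grid value.

    In a violin plot these values are mirrored on both sides of the axis.
    """
    norm = len(sample) * bandwidth * math.sqrt(2 * math.pi)
    return [sum(math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in sample) / norm
            for g in grid]

# Made-up single sample, for illustration only.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = box_summary(sample)
widths = violin_half_widths(sample, grid=[1, 5, 9], bandwidth=1.0)
```

For multiple samples, the same two functions would be applied to each sample in turn and the resulting violins drawn side by side.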
Trellis Plots are segmented plots that display data for each combination of one or more conditioning variables. For example, histograms of the distribution of height among individuals might be displayed side-by-side for men and women. The plots are designed to help users visualize how data change across levels of the conditioning variables.
Statgraphics provides 4 types of trellis plots:
1. Numeric Y: displays characteristics of a single quantitative variable at different combinations of 1 or 2 conditioning variables using a histogram, box-and-whisker plot, or normal probability plot.
2. Categorical Y: displays characteristics of a single categorical variable at different combinations of 1 or 2 conditioning variables using a bar chart, pie chart, or donut chart.
3. Y vs X: displays the relationship between 2 variables at different combinations of 1 or 2 conditioning variables using either a scatterplot or a regression curve.
4. Z vs X and Y: displays the relationship between 3 variables at different combinations of 1 or 2 conditioning variables using a bubble chart, a regression model, or a nonparametric smoother.
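The segmentation common to all four types can be sketched as a grouping step: each panel holds the observations for one combination of the conditioning variables. The example below follows the height example above with made-up data; the function name and column names are hypothetical.

```python
from collections import defaultdict

def trellis_panels(rows, y, conditioning):
    """Split the values of column y into one panel per combination of
    the conditioning variables (1 or 2 names in `conditioning`)."""
    panels = defaultdict(list)
    for row in rows:
        key = tuple(row[name] for name in conditioning)
        panels[key].append(row[y])
    return dict(panels)

# Made-up observations, for illustration only.
rows = [
    {"height": 178, "sex": "M", "age_group": "adult"},
    {"height": 165, "sex": "F", "age_group": "adult"},
    {"height": 182, "sex": "M", "age_group": "adult"},
    {"height": 150, "sex": "F", "age_group": "child"},
    {"height": 148, "sex": "M", "age_group": "child"},
]

by_sex = trellis_panels(rows, y="height", conditioning=["sex"])
by_sex_and_age = trellis_panels(rows, y="height", conditioning=["sex", "age_group"])
```

Each list in the result would then be drawn with the chosen display, for example one histogram per panel in the Numeric Y case.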