Interactive Statlets - Data Exploration

Statgraphics has more than 30 special interactive procedures called Statlets. Statlets are displayed in a special window with a control bar across the top. Changes made to the controls are immediately reflected in the Statlet window. In the case of 3-dimensional displays, scrollbars are included that allow dynamic rotation of the output.

This page displays Statlets that provide interactive implementations of various exploratory data analysis procedures.

Statlet
Bivariate Density Estimator
Demographic Map Brushing
Frequency Histogram
Ridgeline Plots
Sunflower Plot
Trellis Plots
Trivariate Density Estimator
Violin Plots

Frequency Histogram

The Interactive Histogram Statlet creates a frequency histogram for a column of numeric data. The controls on the toolbar make it easy to change the definition of the classes into which the data are grouped. The density function of a normal distribution with the same mean and standard deviation as the data may be superimposed on the histogram. In addition, a nonparametric density trace may be drawn.
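
The three ingredients described above (class counts, a matched normal density, and a nonparametric density trace) can be sketched in a few lines of Python. This is only an illustration and not Statgraphics code: the function names, the sample data, and the choice of a Gaussian kernel for the density trace are all assumptions made for the example.

```python
from statistics import NormalDist, mean, stdev

def histogram_counts(data, lower, upper, n_classes):
    """Count observations in n_classes equal-width classes spanning [lower, upper]."""
    width = (upper - lower) / n_classes
    counts = [0] * n_classes
    for x in data:
        if lower <= x <= upper:
            # The last class is closed on the right so that x == upper is counted.
            k = min(int((x - lower) / width), n_classes - 1)
            counts[k] += 1
    return counts

def normal_density(data):
    """Density of a normal distribution with the sample mean and standard deviation."""
    return NormalDist(mean(data), stdev(data)).pdf

def density_trace(data, bandwidth):
    """Gaussian kernel density estimate (an assumed kernel, chosen for simplicity)."""
    kernel = NormalDist(0.0, 1.0).pdf
    n = len(data)

    def estimate(x):
        return sum(kernel((x - xi) / bandwidth) for xi in data) / (n * bandwidth)

    return estimate

# Made-up sample, used only to exercise the functions.
data = [2.1, 2.4, 2.9, 3.0, 3.3, 3.7, 4.2, 4.8, 5.5, 6.0]
for n_classes in (4, 8):
    print(n_classes, histogram_counts(data, 2.0, 6.0, n_classes))
```

Changing `lower`, `upper`, or `n_classes` plays the role of the toolbar controls: the same data are regrouped and the counts change accordingly.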

Bivariate Density Estimator

The Bivariate Density Statlet creates a frequency histogram for 2 columns of numeric data. It is used to visualize the joint distribution of 2 random variables. The controls on the toolbar make it easy to change the definition of the classes into which the data are grouped. The density function of a bivariate normal distribution with the same means, standard deviations, and covariance as the data may be displayed in place of the histogram. Alternatively, a nonparametric density estimate may be drawn.
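
As a rough illustration of the bivariate normal option, the sketch below builds a density function from the sample means, standard deviations, and covariance of two columns. It is a minimal example under stated assumptions, not the Statlet's implementation; the function name and the data are hypothetical.

```python
import math
from statistics import mean, stdev

def bivariate_normal_density(xs, ys):
    """Bivariate normal pdf matched to the sample moments of xs and ys."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    n = len(xs)
    # Sample covariance with the usual n - 1 divisor.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    rho = cov / (sx * sy)
    scale = 1.0 / (2.0 * math.pi * sx * sy * math.sqrt(1.0 - rho ** 2))

    def pdf(x, y):
        zx = (x - mx) / sx
        zy = (y - my) / sy
        q = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (1.0 - rho ** 2)
        return scale * math.exp(-0.5 * q)

    return pdf

# Made-up, positively correlated columns.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.8]
pdf = bivariate_normal_density(xs, ys)
print(pdf(3.5, 3.5) > pdf(3.5, 1.5))
```

Because the two columns rise together, the fitted density is higher at a point where both values are above their means than at a point where one is above and the other below.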

Trivariate Density Estimator

The Trivariate Density Statlet displays the estimated density function for 3 columns of numeric data. It does so using either a 3-dimensional contour plot or a 3-dimensional mesh plot. The joint distribution of the 3 variables may either be assumed to be multivariate normal or be estimated using a nonparametric approach.

Demographic Map Brushing

The Demographic Map Brushing Statlet plots a demographic map in which each region is colored either blue or red to illustrate the value of a selected variable. Using the Statlet controls, the analyst may change the cutoff that divides red from blue. Interactively changing the cutoff helps in visualizing the distribution of the specified variable throughout the map.
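
The effect of moving the cutoff can be mimicked with a small Python sketch. The region names and values are invented, and the rule that values above the cutoff are red is an assumption; the description above does not say which side of the cutoff gets which color.

```python
def color_regions(values, cutoff):
    """Assign 'red' to regions above the cutoff and 'blue' otherwise (assumed rule)."""
    return {region: ("red" if value > cutoff else "blue")
            for region, value in values.items()}

# Hypothetical values of one variable for four regions.
values = {"North": 12.5, "South": 30.1, "East": 18.0, "West": 25.4}

# Sweeping the cutoff shows which regions change color first.
for cutoff in (10.0, 20.0, 28.0):
    print(cutoff, color_regions(values, cutoff))
```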

Sunflower Plot

The Sunflower Plot Statlet is used to display an X-Y scatterplot when the number of observations is large. To avoid overplotting point symbols when the data set is large, glyphs in the shape of sunflowers show the number of observations that fall in each small region of the X-Y space.
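
The counting step behind such a plot might look like the sketch below, which tallies observations per cell so that each glyph can encode a count. Square cells of a fixed size are an assumption made for the example; the description above does not specify the shape or size of the regions.

```python
import math
from collections import Counter

def sunflower_counts(points, cell_size):
    """Count the points falling in each square cell of side cell_size."""
    counts = Counter()
    for x, y in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return counts

# Made-up points; three of them share the cell with index (0, 0).
points = [(0.1, 0.2), (0.3, 0.4), (0.9, 0.9), (1.5, 0.2)]
for cell, count in sorted(sunflower_counts(points, 1.0).items()):
    print(cell, count)
```

A plotting routine would then draw one glyph per occupied cell, for example with one petal per observation.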

Ridgeline Plots (2D and 3D)

Ridgeline Plots display the distribution of a numeric variable across multiple levels of a categorical factor. They may include histograms, fitted distributions, and nonparametric density estimates. Both 2D and 3D versions are available.

Violin Plots

The Violin Plot Statlet displays data using a combination of a box-and-whisker plot and a nonparametric density estimator. It is very useful for visualizing the shape of the probability density function for the population from which the data came. Violin plots may be constructed for a single sample or for multiple samples.
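
The two components of a violin plot can be computed separately, as in the following sketch: quartiles for the box-and-whisker part and a kernel density estimate for the outline. The Gaussian kernel, the bandwidth, the grid, and the function name are assumptions chosen for illustration; they are not taken from Statgraphics.

```python
from statistics import NormalDist, quantiles

def violin_summary(sample, bandwidth, grid):
    """Quartiles for the box plus density values that set the violin's half-width."""
    q1, median, q3 = quantiles(sample, n=4)
    kernel = NormalDist(0.0, 1.0).pdf
    n = len(sample)
    half_widths = [
        sum(kernel((g - x) / bandwidth) for x in sample) / (n * bandwidth)
        for g in grid
    ]
    return {"q1": q1, "median": median, "q3": q3, "half_widths": half_widths}

# Made-up sample; the outline is widest near the middle of the data.
summary = violin_summary([1, 2, 3, 4, 5, 6, 7], bandwidth=1.0, grid=[0, 4, 8])
print(summary)
```

Mirroring `half_widths` about a central axis gives the familiar violin shape; repeating the call per sample gives one violin per group.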

Trellis Plots

Trellis Plots are segmented plots that display data for each combination of one or more conditioning variables. For example, histograms of the distribution of height among individuals might be displayed side-by-side for men and women. The plots are designed to help users visualize how data change across levels of the conditioning variables.

Statgraphics provides 4 types of trellis plots:

1. Numeric Y: displays characteristics of a single quantitative variable at different combinations of 1 or 2 conditioning variables using either a histogram, box-and-whisker plot, or normal probability plot.

2. Categorical Y: displays characteristics of a single categorical variable at different combinations of 1 or 2 conditioning variables using either a bar chart, pie chart, or donut chart.

3. Y vs X: displays the relationship between 2 variables at different combinations of 1 or 2 conditioning variables using either a scatterplot or a regression curve.

4. Z vs X and Y: displays the relationship between 3 variables at different combinations of 1 or 2 conditioning variables using a bubble chart, a regression model, or a nonparametric smoother.
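
Whatever the plot type, the common step is splitting the data into one panel per combination of the conditioning variables. A minimal sketch of that split is shown below; the field names and records are hypothetical and follow the height example given earlier in this section.

```python
from collections import defaultdict

def trellis_panels(rows, y, conditioning):
    """Group the values of column y by each combination of the conditioning columns."""
    panels = defaultdict(list)
    for row in rows:
        key = tuple(row[name] for name in conditioning)
        panels[key].append(row[y])
    return dict(panels)

# Invented records: height conditioned on sex and age group.
rows = [
    {"sex": "F", "age_group": "adult", "height": 164.0},
    {"sex": "M", "age_group": "adult", "height": 178.5},
    {"sex": "F", "age_group": "adult", "height": 170.2},
    {"sex": "M", "age_group": "child", "height": 132.0},
]

for key, heights in trellis_panels(rows, "height", ["sex", "age_group"]).items():
    print(key, heights)
```

Each resulting group would be drawn in its own panel, for example as a histogram for a Numeric Y trellis plot.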