Analyzing Histograms and Creating Classes

In the Histogram Analysis dialog, you can plot and analyze the distribution of scalar data contained in a multi-ROI, mesh, graph, or vector field, compute basic statistics, and create classes by selecting instances in the data that match specified criteria.

Click the Histogram tool on the Data Properties and Settings panel or click the Histogram Analysis button on the Analyze and Classify Measurements module to open the Histogram Analysis dialog, as shown in the following screen captures.

Histogram Analysis dialog

The Histogram Analysis dialog includes a Statistics tab, on which you can compute basic statistics, such as the minimum, maximum, mean, and median values within the selected data (see Computing Basic Statistics), as well as a Classification tab, on which you can create classes by selecting instances in the data that match some criteria (see Creating Classes). Instances can be added to classes on 1D histograms with the Range Selector tool and by painting on 2D histograms.
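
As a rough illustration of what these statistics summarize, the following Python sketch computes the minimum, maximum, mean, and median for a short list of scalar values. The sample values and variable names are invented for this example and are not taken from the application.

```python
import statistics

# Hypothetical scalar values, for example one measurement per instance.
values = [2.0, 3.5, 3.5, 4.0, 7.5, 9.0, 12.5]

# The four basic statistics named in the text above.
summary = {
    "minimum": min(values),
    "maximum": max(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```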

The following options are available for analyzing histograms of the scalar values contained in a multi-ROI, mesh, graph, or vector field.

Histogram options
  Description
Tools panel The tools at the top of the dialog let you pan, zoom, and reset the histogram, as well as save the figure and export the plotted values in the comma-separated values (*.csv extension) file format (see Histogram Tools).
Histogram Displays an approximate representation of the distribution of the selected scalar value(s), in which the range of values is divided into 'bins' on a 1D or 2D histogram. In other words, it provides a visual interpretation of numerical data by showing the number of data points that fall within each specified range of values.

You can select the scalar slot(s) from which values will be extracted and plotted in the Primary measurement (X) and Secondary measurement (Y) drop-down menus as follows:

Primary measurement (X)… The selected measurement will be plotted on a 1D histogram.

Secondary measurement (Y)… The selected measurements will be plotted on a 2D histogram using Cartesian or polar coordinates. 2D histograms are useful when you need to analyze the relationship between two numerical variables that have a large number of values.

Note For meshes, you may also need to select a scalar type (Face Scalar Values or Vertex Scalar Values) in the Scalar type drop-down menu. For graphs, you will need to select a scalar type (Edge Scalar Values or Vertex Scalar Values) in the Scalar type drop-down menu, as shown below.

If you selected Cartesian for the 2D histogram, then Cartesian coordinates will be used to display the selected measurements in which the X-axis represents the values of the primary measurement and the Y-axis the values of the secondary measurement.
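
The idea behind a Cartesian 2D histogram can be sketched in a few lines of Python: each pair of values is assigned to a cell according to the bin of its primary (X) value and the bin of its secondary (Y) value, and the cells are counted. This is only a minimal sketch. The function name histogram_2d, the measurement names, and the sample values are hypothetical, and the application's own binning rules may differ.

```python
def histogram_2d(xs, ys, bins_x, bins_y):
    """Count (x, y) pairs in a grid of equal-width cells.

    Returns a list of rows; counts[row][col] holds the number of pairs
    whose y value falls in bin `row` and whose x value falls in bin `col`.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    def bin_index(value, low, high, bins):
        if high == low:
            return 0
        index = int((value - low) / (high - low) * bins)
        # The largest value would land one past the end, so clamp it
        # into the last bin.
        return min(index, bins - 1)

    counts = [[0] * bins_x for _ in range(bins_y)]
    for x, y in zip(xs, ys):
        col = bin_index(x, x_min, x_max, bins_x)
        row = bin_index(y, y_min, y_max, bins_y)
        counts[row][col] += 1
    return counts


# Hypothetical primary (X) and secondary (Y) measurements.
volume = [1.0, 2.0, 2.5, 8.0, 9.0, 10.0]
sphericity = [0.9, 0.8, 0.85, 0.3, 0.2, 0.1]

grid = histogram_2d(volume, sphericity, bins_x=2, bins_y=2)
for row in grid:
    print(row)
```

In this made-up data set, small volumes go with high sphericity and large volumes with low sphericity, so all of the counts fall in two opposite corner cells of the grid.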

If you selected Polar for the 2D histogram, then the histogram will be drawn in the polar coordinate system — a two-dimensional coordinate system where each point is determined by a distance from a fixed point and an angle from a fixed direction. Rather than using the standard X and Y coordinates, each point on a polar plane is expressed using these two values:

  • Radius (r), which is the distance from the center of the plot.

  • Theta (θ), which is the angle from a reference direction.

The plane itself is made up of concentric circles expanding outward from the origin, or the pole. Polar plots are often used when analyzing data that has a cyclical nature, as shown in the example below.

Polar plot

Note Secondary measurements for polar plots are automatically divided into 8 bins with ranges of 0-45, 45-90, 90-135, 135-180, 180-225, 225-270, 270-315, and 315-360. Each range includes its lower bound, so a value of exactly 90 degrees will be binned in the range 90-135.
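
The angular binning described in the note above can be sketched as follows. Eight sectors of 45 degrees each are used, and a value on a boundary goes into the sector that starts at that boundary, so 90 degrees falls in 90-135. The function names and the sample angles are hypothetical, and the treatment of 360 degrees (wrapped around to 0 here) is an assumption rather than documented behavior.

```python
SECTOR_WIDTH = 45  # degrees; 360 degrees divided into 8 sectors


def sector_index(angle_degrees):
    """Return the 0-based sector for an angle, using [low, high) ranges."""
    # Assumption: angles wrap around, so 360 is treated like 0.
    return int((angle_degrees % 360) // SECTOR_WIDTH)


def sector_range(index):
    """Return the (low, high) bounds in degrees of a sector."""
    low = index * SECTOR_WIDTH
    return (low, low + SECTOR_WIDTH)


# Hypothetical secondary measurement values, in degrees.
angles = [0, 44.9, 45, 90, 134.9, 270, 315, 359.9]

counts = [0] * 8
for angle in angles:
    counts[sector_index(angle)] += 1

for index, count in enumerate(counts):
    low, high = sector_range(index)
    print(f"{low}-{high}: {count}")
```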

Y log If selected, the Y-axis will be plotted in log scale on 1D histograms or the intensity of squares or sectors will be increased on 2D histograms.
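
A log scale is helpful when a few bins hold far more values than the rest. The short Python sketch below shows how counts that differ by several orders of magnitude become evenly spaced after a base-10 logarithm. The counts are invented, and leaving empty bins undefined is an assumption made for this example, not a description of how the application handles them.

```python
import math

# Hypothetical bin counts spanning several orders of magnitude.
counts = [1000, 100, 10, 1, 0]

# log10 is undefined for 0, so empty bins are left as None in this sketch.
log_counts = [math.log10(c) if c > 0 else None for c in counts]

for count, log_count in zip(counts, log_counts):
    print(count, log_count)
```
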
Bin count Determines the number of intervals (bins) into which the range of values will be divided.

Whenever a histogram is constructed, the first step is to 'bin' the range of values (that is, divide the entire range of values into a series of intervals) and then count how many values fall into each interval. The height of each bin shows how many values from the data fall into that range. The bins are consecutive, non-overlapping intervals of a variable.

Note To get meaningful results, selecting an appropriate bin count is crucial. Bins that are too wide can hide important details about the distribution, while bins that are too narrow can produce spikes that are due to chance alone.
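
The binning step and the effect of the bin count can be illustrated with a minimal Python sketch. It divides the range between the smallest and largest value into equal-width bins and counts the values in each one. The function name histogram_1d and the sample values are hypothetical, and equal-width bins over the data range are an assumption, since the application's exact binning rules are not described here.

```python
def histogram_1d(values, bin_count):
    """Split the data range into bin_count equal-width bins and count values."""
    low, high = min(values), max(values)
    width = (high - low) / bin_count
    counts = [0] * bin_count
    for value in values:
        index = int((value - low) / width) if width > 0 else 0
        # The maximum value would land one past the end, so clamp it
        # into the last bin.
        counts[min(index, bin_count - 1)] += 1
    edges = [low + i * width for i in range(bin_count + 1)]
    return edges, counts


# Hypothetical scalar values with one outlier.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]

for bin_count in (2, 4, 8):
    edges, counts = histogram_1d(values, bin_count)
    print(bin_count, counts)
```

With only 2 bins, nearly all of the values collapse into a single bar and the shape of the distribution is lost. With 8 bins, the peak around 3 and the isolated value at 9 are both visible.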

Statistics tab Lets you select the statistic(s) that will be computed and shown on the histogram and in the Statistics box (see Computing Basic Statistics).
Classification tab Includes options for creating and editing classes (see Creating Classes).
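
Conceptually, creating a class from a range selected on a 1D histogram amounts to collecting the instances whose value lies inside that range. The Python sketch below illustrates the idea only. The function name select_in_range, the class names, and the values are hypothetical, and treating both ends of the range as inclusive is an assumption.

```python
def select_in_range(values, low, high):
    """Return the indices of instances whose value lies in [low, high]."""
    return [i for i, value in enumerate(values) if low <= value <= high]


# Hypothetical measurement, one value per instance.
measurements = [0.5, 1.2, 3.4, 2.2, 5.0, 2.9]

# Two hypothetical classes, each defined by a value range.
classes = {
    "small": select_in_range(measurements, 0.0, 2.0),
    "medium": select_in_range(measurements, 2.0, 4.0),
}

for name, members in classes.items():
    print(name, members)
```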