The plots
Data visualization in Mirage is provided through four basic views
plus some special plots.
The four basic views are:
- Table view:
A table view of the data matrix, with a color tag attached to each row
that shows its membership in partitional clusters.
Points can be selected by dragging the mouse over the corresponding
rows, and are highlighted by a change in the background color of
those rows.
- Histogram view:
A histogram plot that can be reconfigured to focus on any single
feature. This shows one-dimensional projections of the data set,
divided into a (choosable) number of groups. The groups give
a partitional structure of the data set in this subspace.
One can select one or more clusters by drawing an interval with the
mouse. Partitions from other sources are shown by coloring segments
of each bar, with heights proportional to the fraction of the bar's
members in each cluster. A selected subset is highlighted and can be
tracked in another subspace by reconfiguring the plot to another
feature, or broadcast to other displays.
- Scatter plot:
A scatter plot that displays a two-dimensional projection of the
data, where the X and Y axes can be chosen to be any of the feature
dimensions. Regions in the projection plane can be selected by
drawing boxes or irregular regions with the mouse. The selected points
are highlighted and the same subset can be tracked as the plot is
reconfigured to show a different pair of features, or broadcast to
other plots.
- Feature vector plot:
A feature vector plot is also known as a plot of parallel coordinates or
profiles. This plot shows the projection of data on a
multi-dimensional subspace by plotting the value of every feature
against the index of that feature in the subspace. That is, a point
projected onto a subspace of m dimensions as (z1, ..., zm) is
shown as a curve with nodes marked at (i, zi) for i = 1, ..., m.
This plot is a natural display for vectors such as a spectrum
represented as intensities in each channel, or a time series that has
values at each time step. Vectors of measurements on incomparable
scales must first be standardized so that each component has
mean 0 and standard deviation 1. Data can be selected and
broadcast from this plot by drawing intervals in each feature
dimension and composing unions or intersections of such intervals.
Highlights and partitions are shown by coloring the curves. The plot
can be reconfigured to show vectors in different subspaces with
selections preserved. Feature vectors are defined by format
statements of the form "format vec vecname x1 x2 x3 ...". Vectors of
the same name that are associated with different data entries are
assumed to share the same set of indices.
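
The standardization and the node construction described for the
feature vector plot can be sketched as follows. This is only an
illustration in Python, not Mirage code: the function names
standardize and profile_nodes are hypothetical, and the use of the
population standard deviation (rather than the sample one) is an
assumption, since the text does not say which is meant.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Rescale each feature (column) to mean 0 and standard deviation 1.

    Assumption: the population standard deviation is used. A constant
    column has deviation 0 and is only centered, to avoid dividing by 0.
    """
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)]
            for row in rows]


def profile_nodes(point):
    """Return the nodes (i, z_i), i = 1..m, of one feature vector curve."""
    return [(i, z) for i, z in enumerate(point, start=1)]


# Three data entries with three features on incomparable scales.
rows = [[1.0, 100.0, 0.5],
        [2.0, 300.0, 0.1],
        [3.0, 200.0, 0.3]]
z = standardize(rows)
nodes = profile_nodes(z[0])  # nodes of the curve for the first entry
```

Each entry then becomes one curve through its nodes, and all curves
share the same horizontal positions 1, ..., m.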
Operations on the plots
Data in every plot can be selected by mouse operations that draw
one or more boxes enclosing the selected region. This operation is
available by clicking the rectangle icon in the right tool bar.
Additional shapes, such as an irregular region or a Bezier curve, are
available in scatter plots. The intersection or union of multiple
selected regions can be formed by toggling the intersection/union icon
in the right tool bar.
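
The effect of combining several box selections by union or by
intersection can be sketched as follows. This is a minimal Python
illustration under stated assumptions, not the way Mirage implements
it: the names in_box and select, and the box layout
(xmin, ymin, xmax, ymax), are hypothetical.

```python
def in_box(point, box):
    """True if a 2-D point lies inside an axis-aligned box.

    The box is given as (xmin, ymin, xmax, ymax); this layout is an
    assumption made for the example.
    """
    x, y = point
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax


def select(points, boxes, mode="union"):
    """Return the indices of the points selected by several boxes.

    mode="union" keeps points inside any box;
    mode="intersection" keeps points inside every box.
    """
    if mode not in ("union", "intersection"):
        raise ValueError("mode must be 'union' or 'intersection'")
    if not boxes:
        return []  # nothing drawn, nothing selected
    combine = any if mode == "union" else all
    return [i for i, p in enumerate(points)
            if combine(in_box(p, b) for b in boxes)]


points = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5), (5.0, 5.0)]
boxes = [(0, 0, 2, 2), (1, 1, 3, 3)]  # two overlapping boxes

union_sel = select(points, boxes, mode="union")          # [0, 1, 2]
inter_sel = select(points, boxes, mode="intersection")   # [1]
```

With two overlapping boxes, the union selects every point covered by
either box, while the intersection keeps only the points in the
overlap.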
Some plots have choosable actions built in, such as changing the
axes, or stepping through each data entry. These actions can be
triggered by pressing the circle icons in each plot. Pressing the
solid circle starts or stops a continuous action. The broken circles
perform one step of the action in the forward or backward direction.
Operations with selected data
Data selected from each plot can be colored, shown in isolation
in one of the four basic views, or broadcast to other plots.