Simulation results visualization

Visualization panels and tabs help you explore computational results, which include predictions, models, accuracy metrics and plots. GMDH Shell gives access not only to the final predictions but also to all models used to calculate them. Since most visualization panels can show just one model at a time, a tool called the Model browser lets you switch between target variables and models. To illustrate the user interface elements, we use simulation results of built-in examples.

Model browser

The Model browser organizes target variables and their models as trees. A top-level record represents the post-processed (final) predictions calculated for the selected target, while tree leaves represent one or more raw models. There are three cases in which a target receives more than one model:

  • Time series tasks with the Forecast past periods option and more than one simulation requested (Perform simulations > 1).
  • Multivariate time series tasks with more than one forecast horizon (for example, Forecast horizon is 1,3,6,12).
  • Classification tasks with more than 2 classes (one-vs-all classification).
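
The third case can be pictured with a small sketch. In a one-vs-all scheme, a problem with K classes is turned into K two-class problems, which is why the target receives one raw model per class. The helper name `one_vs_all_targets` and the sample labels are hypothetical; how GMDH Shell encodes the binary targets internally is not stated here.

```python
def one_vs_all_targets(labels):
    """Build one binary target per class: 1 for "this class", 0 for "any other".

    labels: list of integer class labels.
    Returns a dict mapping each class to its binary target list.
    """
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


# Hypothetical three-class target: three binary targets, hence three raw models.
labels = [0, 1, 2, 1]
binary_targets = one_vs_all_targets(labels)
for cls, target in binary_targets.items():
    print(cls, target)
```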

Any change in the Model browser is immediately reflected in the related panels. When a raw model is selected, you can also access the lower Model ranks, that is, models outperformed by the top-ranked model. One reason for inspecting suboptimal models is that they can be averaged in the postprocess module. To see the results of averaging, select the postprocessed results (root node) in the Model browser.
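
As a rough illustration of why lower-ranked models matter, the sketch below averages the predictions of the best-ranked models point by point. It assumes a plain unweighted mean; the exact averaging scheme of the postprocess module is not described here, and the function name `average_top_models` and the sample numbers are hypothetical.

```python
def average_top_models(predictions_by_rank, top_k):
    """Average the predictions of the top_k best-ranked models, point by point.

    predictions_by_rank: list of prediction lists, ordered from Rank 1 downwards.
    Assumption: a plain unweighted mean (the real postprocess module may differ).
    """
    if not 1 <= top_k <= len(predictions_by_rank):
        raise ValueError("top_k must be between 1 and the number of models")
    selected = predictions_by_rank[:top_k]
    n_points = len(selected[0])
    if any(len(p) != n_points for p in selected):
        raise ValueError("all models must predict the same number of points")
    return [sum(values) / top_k for values in zip(*selected)]


# Hypothetical predictions of three ranked models for the same three data points.
ranked = [
    [10.0, 12.0, 14.0],  # Rank 1
    [11.0, 13.0, 15.0],  # Rank 2
    [12.0, 11.0, 16.0],  # Rank 3
]
averaged = average_top_models(ranked, top_k=2)
print(averaged)
```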

Watch a video tutorial (in Adobe Flash format).

Plot

The Plot tab in the Simulation results is intended to help you visually estimate the quality of regression or time series models. Like any other visualization in GMDH Shell, the Plot is fully interactive. When a model is selected in the Model browser, it immediately appears in the Plot as a blue curve with a red continuation that marks the model forecast. If available, actual data is plotted as a gray curve.

If forecast values have no IDs or timestamps of their own, these points are marked with +1, +2, +3 and so on. The Plot shows postprocessed (final) predictions as a thick line with dots. In raw mode, models are plotted as a thin line together with thick dots of the postprocessed predictions. You can switch between final predictions and raw models using the Model browser.

Current model panel

Model complexity: 3 of 45

The current model (Rank 1, Time point 24) is a sum of 4 components selected out of 61 available components.

Criterion value: 2.3367

Testing performance of the current model in terms of the selected validation criterion.

Model components & coefficients

Should be interpreted as the following model:

Rmax[t] = 1.078*Year[t-1] + 4.2021*RMax[t-7] - 0.4531*RPeak[t-5].
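
To make the reading of this equation concrete, here is a minimal sketch that evaluates it: each component contributes its coefficient multiplied by the value of a variable a given number of steps back in time. The coefficients and lags are copied from the example above; the function name `predict_rmax`, the data layout (one list per variable, indexed by time) and the sample values are assumptions made for illustration only.

```python
# Coefficients and lags copied from the example model above:
# (variable name, lag, coefficient)
COMPONENTS = [
    ("Year", 1, 1.078),
    ("RMax", 7, 4.2021),
    ("RPeak", 5, -0.4531),
]


def predict_rmax(series, t):
    """Evaluate the example model at time index t.

    series: dict mapping a variable name to a list of values indexed by time.
    Each component contributes coefficient * variable[t - lag].
    """
    max_lag = max(lag for _, lag, _ in COMPONENTS)
    if t < max_lag:
        raise ValueError(f"t must be at least {max_lag} to cover all lags")
    return sum(coef * series[name][t - lag] for name, lag, coef in COMPONENTS)


# Made-up data, only to show the mechanics of the lagged terms.
series = {
    "Year": [float(y) for y in range(2000, 2010)],
    "RMax": [1.0] * 10,
    "RPeak": [2.0] * 10,
}
value = predict_rmax(series, t=7)
print(round(value, 4))
```

With t=7 the sketch reads Year at index 6, RMax at index 0 and RPeak at index 2, which is why at least seven earlier points are needed before the first prediction can be made.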

Performance panel

The Performance panel shows how accurate our models are on the known part of the data in terms of different error measures.

Error measure     | Mean                                 | Root mean square
Absolute          | MAE: Mean absolute error             | RMSE: Root mean square error
Range percentage  | NMAE: Normalized mean absolute error | NRMSE: Normalized root mean square error
Target percentage | MAPE: Mean absolute percentage error | RMSPE: Root mean square percentage error

To see the accuracy estimates in the first two sections (Post-processed predictions and Current model predictions), we must provide the Performance panel with at least a small number of actual values of the target variable that we are trying to model. This requirement is met if Preprocess is set to hold out some instances from the Solver, or if we apply a model to a new data file.

The Performance panel shows the Maximal positive, Maximal negative, Mean absolute and Root mean squared values of the error. Error values are either absolute, normalized by the range of the target variable, or normalized by the values of the target variable. The range of a target variable is always calculated only from the data points used for learning, i.e. data points that fall into the training and testing parts.
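
The six measures named above can be sketched as follows. This is a minimal illustration that assumes the standard textbook definitions and expresses the normalized measures as percentages; the exact formulas used by GMDH Shell are not given here. The function name `error_measures` and all numbers are hypothetical. Note that the range is taken from the learning data only, as described above.

```python
import math


def error_measures(actual, predicted, learning_actual):
    """Return the six error measures named above as a dict.

    actual, predicted: equally long lists of target values and predictions.
    learning_actual: target values of the learning (training + testing) part,
        used only to compute the range for NMAE and NRMSE.
    Assumption: standard textbook definitions, percentages expressed as 0-100.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equally long")
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    value_range = max(learning_actual) - min(learning_actual)

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Relative errors divide each error by its own actual value (must be non-zero).
    relative = [e / a for e, a in zip(errors, actual)]
    return {
        "MAE": mae,
        "RMSE": rmse,
        "NMAE": 100.0 * mae / value_range,
        "NRMSE": 100.0 * rmse / value_range,
        "MAPE": 100.0 * sum(abs(r) for r in relative) / n,
        "RMSPE": 100.0 * math.sqrt(sum(r * r for r in relative) / n),
    }


# Made-up numbers: the learning data spans 10..50, so the range is 40.
learning = [10.0, 20.0, 30.0, 40.0, 50.0]
measures = error_measures(
    actual=[20.0, 40.0], predicted=[22.0, 36.0], learning_actual=learning
)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```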

For classification problems, the number of model misses is measured. Integer values of the target variable are used as the different classes when measuring model performance. Class A always corresponds to the smallest integer class, for example 0 if a two-class problem consists of classes 0 and 1.
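
A minimal sketch of this counting, under stated assumptions: a miss is taken to be any prediction whose class differs from the actual class, and letters beyond A are assumed to follow the sorted order of the integer classes (only the rule for Class A is stated above). The function names and sample labels are hypothetical.

```python
def count_misses(actual, predicted):
    """Count predictions whose class differs from the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must be equally long")
    return sum(1 for a, p in zip(actual, predicted) if a != p)


def class_letters(actual):
    """Map integer classes to letters, smallest integer first.

    Assumption: classes after A are lettered in ascending integer order.
    """
    classes = sorted(set(actual))
    return {c: chr(ord("A") + i) for i, c in enumerate(classes)}


# Hypothetical two-class problem with classes 0 and 1.
actual = [0, 1, 1, 0, 1]
predicted = [0, 1, 0, 0, 0]
misses = count_misses(actual, predicted)
letters = class_letters(actual)
print(misses, letters)
```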

Importance table

The Importance tab shows the absolute number of times each model component was used in the set of obtained models. For a linear base model, this represents the importance of the plain variables.
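
The counting idea can be sketched in a few lines. The sketch assumes that each model is available as a list of component names and that a component is counted once per model in which it appears; both points, as well as the function name `component_importance` and the sample models, are assumptions for illustration.

```python
from collections import Counter


def component_importance(models):
    """Count in how many models each component appears.

    models: list of models, each given as a list of component names.
    Assumption: a component is counted once per model that uses it.
    """
    counts = Counter()
    for components in models:
        counts.update(set(components))
    return counts


# Hypothetical set of three obtained models.
models = [
    ["Year[t-1]", "RMax[t-7]", "RPeak[t-5]"],
    ["Year[t-1]", "RMax[t-7]"],
    ["RMax[t-7]"],
]
importance = component_importance(models)
for name, count in importance.most_common():
    print(name, count)
```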

Table of predictions

The table of predictions has the following columns:

#

The numbering of predictions; it matches the axis numbering on the Plot tab.

ID

Unique data row identifiers.

Actual

Actual values of the target variable.

Predicted

Post-processed predictions of the target variable.

Curr. model

Predictions of a model that is currently selected in the Model browser.

Multi-target report

A multi-target report lets you see target names, predictions, past performance and more in one table.

You can use the configuration button to tune the appearance of a report. You can also save a multi-target report to an HTML file or print it. To save it in PDF format, we recommend using a third-party virtual PDF printer.

CC Attribution-Noncommercial 3.0 Unported