Under the Hood
We keep no secrets about our core algorithms: you can find the details on the documentation page, Learning algorithms. We also suggest reading the Research Report about the GMDH method (PDF), which we like very much.
GMDH Shell implements the following GMDH algorithms:
- Combinatorial GMDH
- GMDH-type neural networks
Our Combinatorial algorithm is similar to the classic combinatorial GMDH algorithm called COMBI, which is described in the PDF mentioned above.
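To illustrate the idea behind a combinatorial search of this kind, here is a minimal Python sketch. It is not the GMDH Shell implementation. It assumes linear models with an intercept, a fixed training/validation split, and validation mean squared error as the selection criterion. All function names (`combi_search`, `fit_least_squares`, and so on) are hypothetical.

```python
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(rows, targets):
    """Fit coefficients through the normal equations (rows include a bias column)."""
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(ata, aty)


def design(rows, subset):
    """Build the design matrix: a bias column plus the selected input columns."""
    return [[1.0] + [row[i] for i in subset] for row in rows]


def mse(rows, targets, coefs):
    """Mean squared error of a linear model on the given rows."""
    errs = [(sum(c * v for c, v in zip(coefs, r)) - t) ** 2
            for r, t in zip(rows, targets)]
    return sum(errs) / len(errs)


def combi_search(x_train, y_train, x_valid, y_valid, max_terms=None):
    """Try every subset of inputs; keep the one with the lowest validation error."""
    n_vars = len(x_train[0])
    max_terms = max_terms or n_vars
    best = None
    for size in range(1, max_terms + 1):
        for subset in combinations(range(n_vars), size):
            try:
                coefs = fit_least_squares(design(x_train, subset), y_train)
            except ValueError:
                continue  # skip subsets whose columns are collinear
            score = mse(design(x_valid, subset), y_valid, coefs)
            if best is None or score < best[0]:
                best = (score, subset, coefs)
    return best  # (validation error, selected input indices, coefficients)
```

The key point of the sketch is that coefficients are fitted on one part of the data while candidate models are compared on another part, so the search favors models that generalize rather than those that merely fit the training rows. The exhaustive loop grows exponentially with the number of inputs, which is why `max_terms` is offered as a cap.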
Our Neural-type algorithm is a variation of the classic Multilayer Iterative GMDH algorithm (MIA). It differs from the original MIA in that it incorporates a number of theoretical improvements introduced by different authors since 1968: GMDH Shell implements all the features that have proven effective and that have been partially included in competing software packages (KnowledgeMiner, NeuroShell). These features are: active neurons, neurons with more than two inputs, connections that skip layers, and bi-criteria neuron selection.
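For readers unfamiliar with MIA, the following Python sketch shows how a single layer of the classic algorithm can be built. It covers only the textbook scheme, not the GMDH Shell variant or the improvements listed above. It assumes two-input neurons with a full quadratic polynomial, a fixed training/validation split, and selection by validation mean squared error. The names `build_layer` and `layer_outputs` are hypothetical, and the small solver is included only so that the example runs on its own.

```python
from itertools import combinations


def lstsq(rows, targets):
    """Least squares via the normal equations and Gauss-Jordan elimination."""
    k = len(rows[0])
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-9:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(k):
            if r != col:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][k] for i in range(k)]


def features(row, i, j):
    """Terms of the quadratic two-input polynomial used by each neuron."""
    a, b = row[i], row[j]
    return [1.0, a, b, a * b, a * a, b * b]


def predict(coefs, row, i, j):
    """Output of one neuron for one data row."""
    return sum(c * f for c, f in zip(coefs, features(row, i, j)))


def build_layer(x_train, y_train, x_valid, y_valid, keep=2):
    """Fit one neuron per pair of inputs and keep the best ones on validation data."""
    neurons = []
    for i, j in combinations(range(len(x_train[0])), 2):
        try:
            coefs = lstsq([features(r, i, j) for r in x_train], y_train)
        except ValueError:
            continue  # skip degenerate input pairs
        err = sum((predict(coefs, r, i, j) - t) ** 2
                  for r, t in zip(x_valid, y_valid)) / len(y_valid)
        neurons.append((err, i, j, coefs))
    neurons.sort(key=lambda n: n[0])
    return neurons[:keep]


def layer_outputs(neurons, rows):
    """Outputs of the surviving neurons; these become inputs of the next layer."""
    return [[predict(c, r, i, j) for _, i, j, c in neurons] for r in rows]
```

In the classic scheme, `build_layer` is called again on the result of `layer_outputs`, and layers are added until the best validation error stops improving.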
Solving modeling problems:
- Multivariate time series forecasting
- Regression (continuous value prediction)
- Classification (prediction of a category)
- Ranking and selection of variables
- Polynomial curve fitting
A modeling simulation produces the following results:
- A set of models that can be exported to Excel
- Predictions
- Importance of input variables
- Analysis of out-of-sample model accuracy
Predictive modeling workflow:
- Create a model
- Save the model
- Export the model's formula to Excel (deploy a model)
- Load a model from a save-file
- Apply the model to unknown instances within the analyzed file
- Apply the model to a new data-file (scoring)
Embedded data exploration:
- File preview
- Descriptive statistics
- Line charts
- Bar charts
- Scatter plot
- Histogram
- Autocorrelation chart
- Pair-wise correlations with ranking
- Contour plot
- Heat map
- 3D surface
Data-file formats:
- CSV (and any other text files with delimiters)
- XLSX
- XLS
- File sets with the same extension
Data pre-processing:
- Visual handling of input and output (target) variables and data transformations
- Handling of missing values
- Converting categorical (text) data into numeric values (encoding and binary decomposition)
- Weighting of dataset rows (handling of imbalanced classification problems)
- Time series preprocessing (lags, differences, moving average, incremental weighting of dataset rows)
- Elementary functions (logarithmic transformation, normalization, etc.)
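Several of the transformations listed above are simple to express in code. The Python sketch below shows one plausible reading of lags, differences, moving averages, and the binary decomposition of a categorical column. It is an illustration under assumptions, not a description of how GMDH Shell computes them: the function names are hypothetical, positions without enough history are filled with `None`, and binary decomposition is taken to mean one indicator column per category.

```python
def lag(series, k):
    """Shift the series by k steps; the first k positions have no history."""
    return [None] * k + series[:len(series) - k]


def difference(series):
    """First differences: each value minus the previous one."""
    return [None] + [b - a for a, b in zip(series, series[1:])]


def moving_average(series, window):
    """Trailing moving average over the last `window` values."""
    out = [None] * (window - 1)
    for end in range(window, len(series) + 1):
        out.append(sum(series[end - window:end]) / window)
    return out


def binary_decomposition(values):
    """Turn a text column into one 0/1 indicator column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows
```

For example, `lag(sales, 1)` and `moving_average(sales, 3)` would each add a derived input column built from a hypothetical `sales` series, which a model can then use to predict the next value.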
Dynamic post-processing:
- Average of top-ranked models
- Quantization of predictions
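Both post-processing steps can be pictured with a short Python sketch. It rests on assumptions rather than on the documented behavior of GMDH Shell: models are ranked by a single error value (lower is better), the average is unweighted, and quantization is taken to mean rounding each prediction to the nearest multiple of a step. The function names are hypothetical.

```python
def average_top_models(predictions_by_model, errors, top=3):
    """Average the predictions of the `top` models with the lowest error."""
    ranked = sorted(range(len(errors)), key=lambda m: errors[m])[:top]
    n = len(predictions_by_model[0])
    return [sum(predictions_by_model[m][i] for m in ranked) / len(ranked)
            for i in range(n)]


def quantize(predictions, step=1.0):
    """Round each prediction to the nearest multiple of `step`."""
    return [round(p / step) * step for p in predictions]
```

Averaging several good models tends to smooth out the individual errors of each one, while quantization is useful when the target can only take discrete values, such as whole units sold.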
Miscellaneous:
- Background execution mode via the command line
- Dataset examples and project templates
- One-click result recalculation for dynamically updated data files
- Support for multi-core processors
- Support for clustered Linux systems (Enterprise edition)
