This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
import [2013/11/08 22:43] Oleksiy |
import [2014/03/25 21:42] Oleksiy [Import] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Import ====== | ||
- | Import module is used to setup connection | + | ====== Data connection |
- | The following connections are available: | + | **Data connection** |
- | * **CSV/ | + | |
- | * **ODBC/ | + | |
- | * **Sage Accounting connection** is used to extract retail sales data from 'Sage 50' and other versions | + | |
- | + | ||
- | ==== CSV/ | + | There are a number of data connection options: |
+ | * **CSV/ | ||
+ | * **ODBC/ | ||
+ | * **Order list connection** available only in the Business Forecasting package is used to extract historical demand from the raw list of customer orders. | ||
+ | * **Sage Accounting connection** available only in the Business Forecasting package is used to extract retail sales data from 'Sage 50' and other versions of Sage accounting software. | ||
+ | ===== CSV/ | ||
- | **File > New project > CSV/ | + | **File > New project > CSV/ |
+ | * The module works with columns. If series of observations are rows (not columns), **Transpose data** option must be turned-on. | ||
+ | * The module cannot read password-protected spreadsheets. | ||
+ | In the right part of the connection dialog there is a preview window with preliminary import results. It uses the following color-based indication of data types: | ||
+ | * Numeric cells - black. | ||
+ | * Text cells (categorical data) - blue. | ||
+ | * Date/time cells - green. | ||
+ | * Missing values - light gray. (if the cell is not empty) | ||
+ | * Discarded columns - gray cells. | ||
- | The module has support for | + | {{ :img: |
- | * Tables that store variables as columns (by default). | + | |
- | * Tables that store variables as rows. | + | |
- | * CSV, XLS and XLSX file formats. | + | |
- | * Reading table headings from the the first row. | + | |
- | * Reading row labels from a column that contains unique Timestamps or IDs. | + | |
- | * Reading row labels composed from two or three separate columns: Year + Month, Year + Week, Year + Quarter, Year + Month + Day. | + | |
- | * Detection of text (categorical) variables. (blue columns) | + | |
- | * Detection of Date/time variables in the dataset.(green columns) | + | |
- | * Reading records in backward order. | + | |
- | * Missing values detection. | + | |
- | :!: Limitations: | + | == Column labels == |
- | * The module can't read from password-protected files. | + | |
- | Data files may consist of numeral, categorical (text) | + | **Column labels** are used as variable names, therefore you should either instruct GMDH Shell to **Use 1st row** for this purpose |
- | * Categorical | + | == Row labels |
- | * Date/time - green. | + | |
- | * Missing values - light grey. (if the cell in not empty) | + | |
- | Column names are allowed but not required. | + | **Row labels** |
- | GMDH Shell does not provide tools for file editing but allows a user to keep data file opened for editing. It is not required | + | |
- | === Import dialog === | + | If your dataset store timestamps elements (year, month, day, week and time) in separate columns, you can compose the timestamp from several columns. For this purpose use the option **Compose ID from several columns**. All timestamp elements in the dataset must be neighboring columns and the option **Read row lables from column N** must point to the first of them. Supported combinations are: **Year + Month**, **Year + Week**, **Year + Quarter**, **Year + Month + Day**, **Date + Time**. |
- | - Click on the Import button {{: | ||
- | - Select one of your data files, press OK. Then the Import configuration dialog opens. | ||
- | - In the Import dialog set the importing parameters, press OK. | ||
- | - If your project folder doesn' | ||
- | - In the Template selection dialog choose a relevant [[Templates|Template]] and press OK. | ||
- | If your project folder contains several data files, the Import module makes all of them available in the Data manager. Selection of just one file points importer module to the whole directory. | + | ==Missing values== |
- | The file selected during the import procedure receives a special status '' | + | Import |
- | Current file path will appear | + | **Missing value mark** is used to type in the missing mark or to select one of the standard marks. |
- | {{: | + | **Consider text cells as missing** is used to replace all non-numeric cells with regular NULL values. |
- | ==== Import configuration ==== | + | == Other settings |
- | {{: | + | **Delimiter** is used to set the delimiter type such as coma, space, tab, or any other. This option is applicable only to ASCI files with delimiters ('' |
- | == Read column labels from the 1st row == | + | **Import all files with the same extension** |
- | Reads column names from the first row of data file(s). The number of elements in the first row is used for detection of data table width. | + | |
- | == Read row labels (ID, timestamp) from column N == | + | **Import all sheets |
- | If you have unique data row identifiers, | + | |
- | In case of multiple data files, the row labels will be taken from the '' | + | |
- | Quite often datasets have date marks such as year and month or week located in separate columns. Then you can compose timestamps from several columns using the option **Compose ID from several columns**. | + | **Transpose tables, i.e. read columns from rows** is used to support data series formatted as rows (not columns). |
- | In this case all aggregated | + | |
- | ==CSV delimiter== | + | **Reverse order of rows** |
- | + | ||
- | Sets a delimiter type. Applicable to CSV files only. | + | |
- | + | ||
- | ==Missing value mark== | + | |
- | + | ||
- | Import | + | |
- | + | ||
- | ==Consider text cells as missing== | + | |
- | + | ||
- | Replaces any non-numeric values with regular NULL values. | + | |
+ | **Import rows starting from** is used to skip a number of rows in the top of the table, in particular this option is used to skip the header information. | ||
==== ODBC/OLEDB connection ==== | ==== ODBC/OLEDB connection ==== | ||
Line 90: | Line 66: | ||
**File > New project > Sage Accounting connection** is used to easily extract retail sales data from 'Sage 50' and other versions of Sage accounting software (www.sage.co.uk). The module is only available in the ' | **File > New project > Sage Accounting connection** is used to easily extract retail sales data from 'Sage 50' and other versions of Sage accounting software (www.sage.co.uk). The module is only available in the ' | ||
- | |||
- | |||
- | |||
- | |||
- | The import module can be selected in the toolbar | ||