This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
import [2012/06/10 23:10] Oleksiy [CSV/XLS/XLSX Import] |
import [2021/06/01 03:27] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Import ====== | ||
- | The import module reads data from data sources. | + | ====== Data connection ====== |
- | * CSV/ | + | |
- | * ODBC/OLEDB Import | + | |
- | | + | |
- | ==== CSV/ | + | **Data connection** module is used to set up connections with files or databases. GMDH Shell accesses data files or databases in the beginning of each forecasting launch. |
- | {{:img: | + | There are a number of data connection options: |
+ | * **CSV/ | ||
+ | * **ODBC/ | ||
+ | * **Order list connection** available only in the Business Forecasting package is used to extract historical demand from the raw list of customer orders. | ||
+ | * **Sage Accounting connection** available only in the Business Forecasting package is used to extract retail sales data from 'Sage 50' and other versions of Sage accounting software. | ||
+ | ===== CSV/ | ||
- | The module has support for | + | **File > New project > CSV/XLS/XLSX connection** is used to connect data files, this can be one or several Excel files or ASCI files with delimiters, for example, '' |
- | | + | * The module works with columns. If series |
- | * Tables that store variables as rows. | + | * The module cannot read password-protected spreadsheets. |
- | | + | |
- | | + | |
- | | + | |
- | * Reading row labels composed from two or three separate columns: Year + Month, Year + Week, Year + Quarter, Year + Month + Day. | + | |
- | * Detection | + | |
- | | + | |
- | * Reading records in backward order. | + | |
- | * Missing values detection. | + | |
- | :!: Limitations: | + | In the right part of the connection dialog there is a preview window with preliminary import results. It uses the following color-based indication of data types: |
- | * The module reads data only from the first sheet of .xls or .xlsx file.\\ | + | * Numeric cells - black. |
- | * The module can't read from password-protected files. | + | * Text cells (categorical |
+ | * Date/time cells - green. | ||
+ | * Missing values | ||
+ | * Discarded columns - gray cells. | ||
- | Data files may consist of numeral, categorical (text) or date/time columns. There is a color-based indication of column types: | + | {{ :img: |
- | * Categorical (text) - blue. | + | == Column labels == |
- | * Date/time - green. | + | |
- | * Missing values - light grey. (if the cell in not empty) | + | |
- | Column | + | **Column |
- | GMDH Shell does not provide tools for file editing but allows a user to keep data file opened | + | |
- | === Import dialog === | + | == Row labels (ID, timestamp) |
- | - Click on the Import button {{: | + | **Row labels** are used to refer to observations, **Row labels** must be unique like date and time or like identifiers. |
- | - Select one of your data files, press OK. Then the Import configuration dialog opens. | + | |
- | - In the Import dialog set the importing parameters, press OK. | + | |
- | - If your project folder doesn' | + | |
- | - In the Template selection dialog choose a relevant [[Templates|Template]] | + | |
- | If your project folder contains several data files, the Import module makes all of them available | + | If your dataset store timestamps elements (year, month, day, week and time) in separate columns, you can compose |
- | The file selected during the import procedure receives a special status '' | ||
- | Current file path will appear in the title bar of GMDH Shell window: | + | ==Missing values== |
- | {{: | + | Import |
- | === Import configuration === | + | **Missing value mark** is used to type in the missing mark or to select one of the standard marks. |
- | == Read column labels from the 1st row == | + | **Consider text cells as missing** |
- | Reads column names from the first row of data file(s). The number of elements in the first row is used for detection of data table width. | + | |
- | == Read row labels (ID, timestamp) from column N == | + | == Other settings |
- | If you have unique data row identifiers, | + | |
- | In case of multiple data files, the row labels will be taken from the '' | + | |
- | Quite often datasets have date marks such as year and month or week located in separate columns. Then you can compose timestamps from several columns using the option | + | **Delimiter** is used to set the delimiter type such as coma, space, tab, or any other. This option |
- | In this case all aggregated columns in the dataset must be neighboring and the option | + | |
- | ==CSV delimiter== | + | **Import all files with the same extension** is used to connect all files with the same extension within the current directory. |
- | Sets a delimiter type. Applicable | + | **Import all sheets of a workbook** is used to connect all sheets of one or many '' |
- | ==Missing value mark== | + | **Transpose tables, i.e. read columns from rows** is used to support data series formatted as rows (not columns). |
- | Import | + | **Reverse order of rows** |
- | ==Consider text cells as missing== | + | **Import rows starting from** is used to skip a number of rows in the top of the table, in particular this option is used to skip the header information. |
- | Replaces any non-numeric values with regular NULL values. | + | ==== ODBC/OLEDB connection ==== |
+ | **File > New project > ODBC/OLEDB connection** is used to connect various databases. Most database vendors provide at least a minimal ODBC driver with their database. This import module requires knowledge of SQL quires. | ||
- | ==== Data file examples ==== | ||
- | \\ | + | ==== Sage Accounting connection ==== |
- | {{: | + | **File > New project > Sage Accounting connection** is used to easily extract retail sales data from 'Sage 50' and other versions of Sage accounting software (www.sage.co.uk). The module is only available in the ' |
- | \\ | ||
- | |||
- | {{: | ||
- | |||
- | |||
- | |||
- | ==== ODBC/OLEDB Import ==== | ||
- | |||
- | This preprocessor will be available soon. | ||