ONE DATA offers multiple ways of inserting raw data, whether by uploading CSV files or by connecting to existing databases. These features also give the user better visualization, datatype manipulation, and initial statistics about the data (number of null values, count of distinct values, bar charts and whisker plots, etc.).
TABLE OF CONTENTS
- Data Tables
- Dataset Load and Dataset Save
- Upload from ETL-source
Browse Data Tables
Existing data tables can be accessed via the DATA TABLES tab. Each row presents an overview of the corresponding data table:
- Number of linked workflows
- Modules and projects containing the data table
- Owner of the data table
- Last modification date and time
- Origin (CSV, FRT, etc.)
Actions related to a specific data table can also be performed, such as showing related notes and sharing the data table with a different project.
The following navigation tools are shown in the figure below:
- If the number of available data tables exceeds the number of rows per page (here 25), the arrows can be used to navigate between pages.
- The search bar can be used to find an explicit data table by its name.
- The toggle button is used to include resources with no project assignment.
Further actions are available: "upload" or "fetch from projects" to add a data table; "remove", "delete" or "move to other projects" to remove a data table; and "share to other projects" to share it.
Removing a data table removes it from the current project only, while deleting it removes it from all projects.
Upload a Data Table
- Press "upload" to add a new data table to the current project.
- Press "SELECT .CSV OR .ZIP FILE" to upload a dataset.
If a valid, uncorrupted file is selected, a preview of the first lines of the data table is displayed.
The file separator (Delimiter Token) is recognized automatically when uploading a dataset. Alternatively, it can be defined manually. The String Escape Token, which marks the beginning and ending of a string variable, is set to quotation marks (") by default and can also be changed.
Edit a Data Table
After selecting and uploading the right file, a preview of the data table is shown. The preview contains general information about the data table (creation date, modification date, origin). General modifications are also possible (set representation or scale for all variables).
Under the "Data table sample" tab, record samples are shown. It's possible to change the type or the column name for each variable independently.
General statistics about each column are shown under the "Data table statistics" tab.
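To make two of these statistics concrete, the sketch below computes the number of null values and the count of distinct values per column in plain Python. It only illustrates the measures themselves; the function name column_statistics, the sample records, and the choice of empty strings and None as null markers are assumptions, not ONE DATA behavior.

```python
def column_statistics(rows, null_markers=("", None)):
    """Count null values and distinct non-null values for each column.

    `rows` is a list of dicts mapping column name to value (hypothetical input shape).
    """
    collected = {}
    for row in rows:
        for name, value in row.items():
            entry = collected.setdefault(name, {"nulls": 0, "values": set()})
            if value in null_markers:
                entry["nulls"] += 1
            else:
                entry["values"].add(value)
    return {
        name: {"null_count": entry["nulls"], "distinct_count": len(entry["values"])}
        for name, entry in collected.items()
    }


# Hypothetical records, as they might appear in a data table sample.
records = [
    {"city": "Berlin", "age": "30"},
    {"city": "", "age": "30"},
    {"city": "Paris", "age": None},
]

for column, stats in column_statistics(records).items():
    print(column, stats)
```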
Save a Data Table
ONE DATA can save the output generated by workflows to new or existing data tables. This is done with the Data Table Save Processor.
(Locally) Download a Data Table
Data tables in ONE DATA can also be downloaded to your local computer; the download location depends on your browser settings. There are two ways to do this:
- Directly in the Data Table Save Processor.
- In the overview of the data table.
To download the CSV directly in the Data Table Save Processor, first make sure the respective workflow has been executed successfully. Then open the processor's configuration and navigate to the "Raw Data/Data Table Notes" tab. Hovering over the data table preview shows a button on the far right, which offers two ways of downloading the data table as CSV:
- Download Table as CSV (with Filters and Aggregation)
- Download Page as CSV
Either download option opens the same prompt for specifying CSV properties (CSV Download Settings), which is described further below.
Accessing the data table from a project overview shows the following screen, where the data table can be downloaded in one of two formats: CSV or EXCEL.
Choosing EXCEL as the file format downloads the data table to your system directly. For CSV files, a prompt opens that lets you define the structure of the CSV file to be created:
The following settings can be adjusted:
- Decimal Separator Token: This token defines the character that separates the integer part from the fractional part of a number in your data table.
Options are the comma (as used in Germany, e.g. "1,4") and the full stop (as used in the UK or the US, e.g. "3.5").
- Column Separator Token: This token defines the delimiter that is added between the columns of the data. Prominent examples are commas (e.g. "a,b"), semicolons (e.g. "a;b"), or more exotically pipes (e.g. "a|b").
- String Escape Token: This token matters whenever content in your data table contains the character used as the "Column Separator Token" but should still be treated as a single string. Wrapping the string in the "String Escape Token" marks it as belonging together.
As an example, imagine you have the entry "I want to download my data table as CSV, too!". If the Column Separator Token is set to a comma and the entry is not wrapped, you would receive two entries in the CSV: "I want to download my data table as CSV" and "too!". With the String Escape Token set to "&", for example (to make it more explicit), the result in the CSV is one entry: " &I want to download my data table as CSV, too!& ".
Please note that some software might not display certain characters, e.g. quotation marks in Excel. ONE DATA should nevertheless always add them whenever specified; you can check this with other software, e.g. a simple text editor.
- Escape Token: This token is used whenever your data contains characters that need escaping in certain encodings. A prominent example is the quotation mark (").
As an example, setting the Escape Token to "\" and having the entry " "escapes" " in your data table leads to the entry " \"escapes\" " in your CSV download.
- Encoding: The encoding that will be used for the downloaded CSV file (you can read about what encoding is here).
- File Name: The name for the downloaded CSV file.
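How these settings interact can be sketched with Python's standard csv module. This is only an illustration under stated assumptions, not ONE DATA's export code: the function export_csv and its parameter names are hypothetical and merely mirror the setting names above, and the csv module's quoting rules may differ in detail from what ONE DATA writes.

```python
import csv
import io


def export_csv(header, rows, column_separator=";", string_escape_token='"',
               escape_token="\\", decimal_separator=","):
    """Render rows as CSV text using settings named after the download prompt."""

    def format_value(value):
        # Apply the Decimal Separator Token to floating-point numbers only.
        if isinstance(value, float):
            return str(value).replace(".", decimal_separator)
        return value

    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=column_separator,      # Column Separator Token
        quotechar=string_escape_token,   # String Escape Token
        escapechar=escape_token,         # Escape Token
        doublequote=False,               # escape embedded quote characters instead of doubling them
        quoting=csv.QUOTE_MINIMAL,       # wrap only fields that need it
        lineterminator="\n",
    )
    writer.writerow(header)
    for row in rows:
        writer.writerow([format_value(value) for value in row])
    return buffer.getvalue()


# The String Escape Token example from above: comma as separator, "&" as wrapper.
text = export_csv(
    ["text"],
    [["I want to download my data table as CSV, too!"]],
    column_separator=",",
    string_escape_token="&",
)
print(text)
# text
# &I want to download my data table as CSV, too!&

# Decimal separator and Escape Token applied to a second, hypothetical table.
print(export_csv(["value", "note"], [[1.4, '"escapes"']]))
```

The Encoding and File Name settings would come into play only when the returned text is written to disk, for example via open(file_name, "w", encoding=encoding).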
Dataset Load and Dataset Save
Data tables replace datasets. The corresponding data table processors work similarly to the dataset processors, with small improvements. Dataset-related processors are deprecated and will be removed from ONE DATA in the near future.
The use of data table processors is therefore recommended.
Upload from ETL-source
Data tables can also be created via a database connection. Connected database tables cannot be accessed directly; instead, their content can be copied into a data table. Further information and process steps are available under the following link.
Data tables generated from a database input can also be manipulated using the Flexible ETL Load processor.