KNIME User Training KNIME AG. Copyright 2017 KNIME AG
|
|
- Sheena Pierce
- 5 years ago
- Views:
Transcription
1 KNIME User Training KNIME AG
2 Overview KNIME Analytics Platform 1 2
3 What is KNIME Analytics Platform? A tool for data analysis, manipulation, visualization, and reporting Based on the graphical programming paradigm Provides a diverse array of extensions: Text Mining Network Mining Cheminformatics Weka machine learning Many integrations, such as Java, R, Python, etc. 2 3
4 Additional Resources KNIME pages ( SOLUTIONS for example workflows RESOURCES/LEARNING HUB RESOURCES/NODE GUIDE KNIME Tech pages (tech.knime.org) FORUM for questions and answers DOCUMENTATION for docs, FAQ, changelogs,... COMMUNITY CONTRIBUTIONS for dev instructions and third party nodes KNIME TV on YouTube 3 4
5 The KNIME Analytics Platform 4 5
6 Visual KNIME Workflows NODES perform tasks on data Inputs Outputs Status Not Configured Idle Executed Error Nodes are combined to create WORKFLOWS 5 6
7 Data Access Databases MySQL, PostgreSQL any JDBC (Oracle, DB2, MS SQL Server) Files Csv, txt Excel, Word, PDF SAS, SPSS XML PMML Images, texts, networks, chem Web, Cloud REST, Web services Twitter, Google 6 7
8 Big Data Spark HDFS support Hive Impala HP Vertica In-database processing 7 8
9 Transformation Preprocessing Row, column, matrix based Data blending Join, concatenate, append Aggregation Grouping, pivoting, binning Feature Creation and Selection 8 9
10 Analyze & Data Mining Regression Linear, logistic Classification Decision tree, ensembles, SVM, MLP, Naïve Bayes Clustering k-means, DBSCAN, hierarchical Validation Cross-validation, scoring, ROC Misc PCA, MDS, item set mining External R, Weka 9 10
11 Visualization Interactive Scatter plot, histogram, pie charts, box plot Highlighting (brushing) JFreeChart JavaScript Misc Tag cloud, open street map, networks, molecules External R 10 11
12 Deployment Database Files Excel, csv, txt XML PMML to: local, KNIME Server, SSH-, FTP-Server BIRT Reporting 11 12
13 Over 1500 native and embedded nodes included: Data Access MySQL, Oracle,... SAS, SPSS,... Excel, Flat,... Hive, Impala,... XML, JSON, PMML Text, Doc, Image,... Web Crawlers Industry Specific Community / 3rd Transformation Row, Column Matrix Text, Image Time Series Java Python Community / 3rd Analysis & Mining Statistics Data Mining Machine Learning Web Analytics Text Mining Network Analysis Social Media Analysis R, Weka, Python Community / 3rd Visualization R JFreeChart JavaScript Community / 3rd Deployment via BIRT PMML XML, JSON Databases Excel, Flat, etc. Text, Doc, Image Industry Specific Community / 3rd 12 13
14 Overview Installing KNIME Analytics Platform The KNIME Workspace The KNIME File Extensions The KNIME Workbench Workflow editor Explorer Node repository Node description Installing new features 13 14
15 Install KNIME Analytics Platform Select the KNIME version for your computer: Mac, Win, or Linux and 32 / 64bit Note different downloads (minimal or full) Download archive and extract the file, or download installer package and run it 14 15
16 Start KNIME Analytics Platform Go to the installation directory and launch KNIME, or use the shortcut created on your Desktop
17 The KNIME Workspace The workspace is the folder/directory in which workflows (and potentially data files) are stored for the current KNIME session. Workspaces are portable (just like KNIME) 16 17
18 Welcome Page 17 18
19 The KNIME Workbench Servers and Workflows Workflow Editor Node Recommendations Node Description Node Repository Console Outline Licensed under a Creative Commons Attribution
20 Creating New Workflows, Importing and Exporting Right-click Workspace in KNIME Explorer to create new workflow or workflow group or to import workflow Right-click on workflow or workflow group to export 20
21 KNIME File Extensions Dedicated file extensions for Workflows and Workflow groups associated with KNIME Analytics Platform *.knwf for KNIME Workflow Files *.knar for KNIME Archive Files 20 21
22 More on Nodes A node can have 3 states: Idle: The node is not yet configured and cannot be executed with its current settings. Configured: The node has been set up correctly, and may be executed at any time Executed: The node has been successfully executed. Results may be viewed and used in downstream nodes
23 Inserting and Connecting Nodes Insert nodes into workspace by dragging them from Node Repository or by double-clicking in Node Repository Connect nodes by left-clicking output port of Node A and dragging the cursor to (matching) input port of Node B Common port types: Model Image Flow Variable Data Database Conection Database Query 22 23
24 Node Configuration Most nodes require configuration To access a node configuration window: Double-click the node Right-click > Configure 23 24
25 Node Execution Right-click node Select Execute in context menu If execution is successful, status shows green light If execution encounters errors, status shows red light 24 25
26 Node Views Right-click node Select Views in context menu Select output port to inspect execution results Plot View Data View 25 26
27 Workflow Coach Recommendation engine It gives hints about which node use next in the workflow Based on KNIME communities' usage statistics Usage statistics available also with Personal Productivity Extension and KNIME Server products (these products require a purchased license) 26 27
28 Getting Started: KNIME Example Server Public repository with large selection of example workflows for many, many applications Connect via KNIME Explorer 27 28
29 Online Node Guide Workflows from Example Server also available online
30 Hot Keys (for future reference) Task Hot key Description Node Configuration F6 opens the configuration window of the selected node Node Execution Move Nodes and Annotations Workflow Operations F7 Shift + F7 Shift + F10 F9 Shift + F9 Ctrl + Shift + Arrow Ctrl + Shift + PgUp/PgDown F8 Ctrl + S Ctrl + Shift + S Ctrl + Shift + W executes selected configured nodes executes all configured nodes executes all configured nodes and opens all views cancels selected running nodes cancels all running nodes moves the selected node in the arrow direction moves the selected annotation in the front or in the back of all overlapping annotations resets selected nodes saves the workflow saves all open workflows closes all open workflows Meta-node Shift + F12 opens meta-node wizard 29 30
31 Importing Data Accessing files and databases 1 31
32 Data Source Nodes Typically characterized by: Orange color No input ports, 1-2 output ports Status Output port Node description 2 32
33 New Node: File Reader Workhorse of the KNIME Source nodes Reads text based files Many advanced features allow it to read most weird files Short lines, inline comments, headers and special encoding YouTube KNIME TV Channel video:
34 File Reader Configuration File path Basic Settings Advanced Settings Preview Help Button 4 34
35 Alternative Faster Way Drag & Drop OR Copy & Paste 5 35
36 Filenames and the knime:// protocol Absolute URL Mountpoint-relative URL Local path 36
37 Workflow Relative File Paths Best choice if workflows are to be shared Requires matching folder structure within workflow group Independent of environment outside of workflow group Example: Path to Sentiment Analysis.table Local path: C:\Users\rb\knime-workspace\KNIMEUserTraining\data\Sentiment Analysis.table Workflow relative: YouTube KNIME TV Channel:
38 New Node: Excel Reader (XLS) Reads.xls and.xlsx file from Microsoft Excel Supports reading from multiple sheets 8 38
39 Excel Reader Configuration File path Sheet specific settings Preview 9 39
40 New Node: Table Reader Reads tables from the native KNIME Format. Maximum performance, minimum configuration File path YouTube KNIME TV channel video:
41 Database Connectivity Read data from any JDBC enabled database Write your own SQL or model it using dedicated nodes 11 41
42 New Nodes: Database Connectors Native: Postgres, MySQL, SQLite, MSSQL (SQL Server) Database Connector (e.g. Oracle, DB2, HANA). Commercial extensions: HIVE and Impala 12 42
43 New Node: SQLite Propagate connection information to other DB nodes File-based database Useful for prototyping (switch to real connector on deployment) 13 43
44 New Node: Database Table Selector Take connection information and construct a query Explore DB metadata Output is an SQL query 14 44
45 New Node: Database Connection Table Reader Executes SQL Query Reads results into a KNIME table 15 45
46 Other Useful Data Sources PMML Reader reads standard predictive models XML Reader with XPATH support Python/R Source nodes SAS7BDAT (Labs) REST/SOAP for web services, and many more 16 46
47 Importing Data Exercise Starting with exercise: Importing Data Read the following files Sentiment Analysis.table Sentiment Rating.csv Product Data2.xls Optional: Read table web_activity from the database WebActivity.sqlite (hint: drag and drop the file to your workflow to get started) 17 47
48 Data Manipulation Clean, join, aggregate 1 48
49 Data Manipulation Nodes Yellow color with a variety of input and output ports Apply a transformation to input data Many, many nodes! 2 49
50 New Node: Concatenate Combine rows from 2 tables with shared columns Handles duplicate row keys gracefully Take the union or intersection of columns 3 50
51 New Node: Cell Replacer Replaces the content of a column based on a lookup Top port references the table to be searched Bottom port holds the lookup table (search keys and replacement values) 4 51
52 New Node: String Manipulation Create and edit values in String columns Clean up capitalization (eg. Lowercase) Modify existing strings or create new columns 5 52
53 Data Manipulation Exercise, Activity I Starting with exercise: Data Manipulation, Activity I Concatenate web activity data from old and new systems Replace sentiment evaluation (strings) with corresponding numeric values Use String Manipulation to ensure that all entries of the Products column are lower case from the product data spreadsheet. 6 53
54 Joining Columns of Data Left Table Join by ID Right Table Inner Join Left Outer Join Missing values in the right table. Missing values in the left table. Right Outer Join 7 54
55 Joining Columns of Data Left Table Join by ID Right Table Full Outer Join Missing values in the left table. Missing values in the right table. 8 55
56 New Node: Joiner Combines columns from 2 different tables Top port contains Left data table Bottom port contains the Right data table 9 56
57 Joiner Configuration Linking Rows Values to join on. Multiple joining columns are allowed
58 Joiner Configuration Column Selection Columns from left table to output table Columns from right table to output table 11 58
59 Data Aggregation aggregated on group by method: sum( value ) 12 59
60 New Node: GroupBy Aggregate to remove duplicates or summarize data First tab provides grouping options Second tab provides control over aggregation details Aggregation columns Aggregation methods YouTube KNIME TV video:
61 In-database Data Manipulation Model SQL query using nodes DB versions of GroupBy, Joiner, Row Filter, Sorter, etc
62 Comments & Annotations Double-click to write Right-click to change properties Double-click to write Right-click to change properties YouTube KNIME TV Channel:
63 Workflow Organisation Good Practices Workflow annotations Node labels Metanodes Right click -> Collapse... Organize workflow by task Hide complexity & improve readability 16 63
64 Data Manipulation Exercise, Activity II Starting with exercise Data Manipulation, Activity II Join all data together using a series of joiner nodes and the Customer Key field Resolve duplicates in the joined dataset (hint: GroupBy node) Clean up and document your workflow using annotations, node labels and metanodes 17 64
65 Data Visualization Charts and tables 1 65
66 Data Visualization 2 66
67 Data Visualization Interactive plots and tables (with Highlighting) JavaScript integration for interactive views R View nodes for building advanced graphics in KNIME 3 67
68 New Node: Color Manager One of several visual property managers (e.g. size, shape) Color by nominal or continuous values Sync colors between views using the color model port 4 68
69 Hiliting Hilited data is visible across all views Keep multiple views open to explore complex data 5 69
70 New Node: Scatter Plot Plot different columns on X and Y Displays data including pre calculated visual properties (size, shape, color) Supports highlighting Produces a view, no image 6 70
71 New Node: JavaScript Scatter Plot Plot different columns on X and Y Displays data including pre calculated visual properties (size, shape, color) Does not support highlighting Produces an interactive view and an image 7 71
72 New Node: JavaScript Scatter Plot 3 configuration tabs 8 72
73 New Node: Scatter Plot (JFreeChart) Plot different columns on X and Y Displays data including pre-calculated visual properties (size, shape, color) Does not support highlighting Produces a static view and an image 9 73
74 New Node: Interactive Table No graphics Supports highlighting 10 74
75 Other Nodes: R View R View nodes for maximum customizibility 11 75
76 Visualization Exercise Start with exercise: Visualization Color data by Product Produce a scatter plot of Age vs. Estimated Yearly Income In the Scatter Plot node highlight some data points and view only highlighted points using an Interactive Table node Optional: Visualize data with the R (View node) (start with a script like: plot(knime.in) 12 76
77 Data Mining Partition, learn, predict, score 1 77
78 Data Mining Strategies Example applications: Anomaly Detection (fraud, predictive maintenance) Association Rule Learning (market basket analysis) Clustering (market segmentation) Classification (next best offer, churn preventions) Regression (trend estimation) 2 78
79 Data Mining: Process Overview Original Data Set Training Set Train Model Apply Model Score Model Test Set Partition data Train and apply models Evaluate performance 3 79
80 Data Mining in KNIME KNIME has many modeling tools! Decision tree, random forest, SVM, regression, neural networks, clustering, and integrations with other libraries: WEKA, libsvm, R, Python (scikit-learn) etc. And many model evaluation nodes ROC, standard, numeric and entropy scorers Feature elimination Cross validation 4 80
81 New Node: Partitioning Use to split data into training and evaluation sets Partition by count (e.g. 10 rows) or fraction (e.g. 10%) Sample by a variety of methods; random, linear, stratified 5 81
82 Learner-predictor Motif Most data mining approaches in KNIME use a Learner-predictor motif. The Learner node trains the model with its input data. Trained Model The Predictor node applies the model to a different subset of data. New data! 6 82
83 Classification Predict nominal outcomes on existing data (supervised) Applications Churn analysis (yes/no) Chemical activity (active/inactive) Spam detection (spam/not spam) Optical character recognition (A-Z) Methods Decision Trees Neural Networks Naïve Bayes Logistic Regression 7 83
84 KNIME s Decision Tree J.R. Quinlan, C4.5 Programs for machine learning J. Shafer, R. Agrawal, M. Mehta, SPRINT: A Scalable Parallel Classifier for Data Mining C4.5 builds a tree from a set of training data using the concept of information entropy. At each node of the tree, the attribute of the data with the highest normalized information gain (difference in entropy) is chosen to split the data. The C4.5 algorithm then recurses on the smaller sub lists. 8 84
85 New Node: Decision Tree Learner 9 85
86 Decision Tree View Most unmarried people earn < 50K per year 10 86
87 New Node: Decision Tree Predictor Takes a decision tree model & applies it to new data Check the box to append class probabilities 11 87
88 New Node: Scorer Compare predicted results to known truth in order to evaluate model quality Confusion matrix shows the distribution of model errors An accuracy statistics table provides a detailed analysis of model quality
89 New Node: Scorer True Positives Income = <=50K Predicted = <=50K False Positives Income = <=50K Predicted = >50K False Negatives True Negatives 13 89
90 Scorer: Accuracy Measures 14 90
91 Receiver Operating Characteristics Sort by confidence in target class Plot true positive rate vs false positive rate Ideal models achieve 100% TPR with 0% FPR Area under the curve indicates model quality (1=ideal model, 0.5 = random outcome) 15 91
92 New Node: ROC Curve Requires individual class probabilities from a preceding predictor User must define: 1. Original class column 2. Positive class value 3. Probability for that class from 1 or more models See also the JavaScript ROC Curve node 16 92
93 Data Mining Exercise, Activity I Starting with exercise: Data Mining, Activity I: Partition the fully joined data 50%, Stratified Sampling Train a decision tree on the training data (Learn against Target column) Use the model to predict the upsell potential for remaining records. Evaluate the quality of a model with a Scorer. Optional: Find AUC for the model using ROC curve
94 Regression Predict numeric outcomes on existing data (supervised) Applications Forecasting Quantitative Analysis Methods Linear Polynomial Regression Trees Partial Least Squares 18 94
95 New Nodes: Linear Regression Learner & Regression Predictor A linear model relating a dependent variable to 1 or more independent variables Model coefficients provided in 2nd output port Also available: Polynomial and Tree Ensemble Regression nodes 19 95
96 New Node: Numeric Scorer Similar to scorer node, but for nodes with numeric predictions (e.g. linear/polynomial regression) Compare dependent variable values to predicted values to evaluate goodness of fit. Report R 2, RMSD, SEM etc
97 Data Mining Exercise, Activity II Starting with exercise: Data Mining, Activity II: Partition the fully joined data 50%, Stratified Sampling Train a linear regression model that predicts age as a function of some other parameters in the data set Use the model to predict the age of the remaining users Evaluate the quality of a model with a Numeric Scorer. Is this model useful for predicting customer age? 21 97
98 Clustering Discover hidden structure in unlabeled data (unsupervised) Applications Market Segmentation Diversity picking Methods K-means/medoids Hierarchical DBScan Neighbourgrams 22 98
99 New Nodes: k-means Clustering Looks at n observations to define the means for k clusters. Each observation is then assigned to its closest cluster center. You must provide k
100 New Node: Entropy Scorer Similar to scorer node, but used with unsupervised learning (no target to predict) Cluster labels and reference clusters do not need to be in the same domain (e.g. Match Cluster 1 to iris setosa ) Reports entropy based statistics which indicate model quality (low entropy, high quality is the aim)
101 Data Mining Exercise, Activity III Start with exercise: Data Mining, Activity III Read the location_data.table file Filter to entries from California (region_code = CA) Train a k-means model with k=3. Use only position data for clustering (latitude and longitude) Evaluate with Entropy Scorer. Compare cluster labels to the City column. Hint: use the k-means output for both ports in the input to the scorer. Optional: Plot latitude and longitude in a view (OSM Map or Scatter Plot) and use that to help you visually optimize k
102 Integrating External Tools 1 102
103 Goal of This Session This session gives a quick overview of the external tools that can be called within KNIME, e.g.: Java, R, any external tool of choice, web service, and so on
104 KNIME Labs KNIME Labs enable you to preview new KNIME features and plug-ins that are still under development. The nodes provided in KNIME Labs (KNIME Tech) are not (yet) part of the official KNIME version because the functionality and/or API may not be finalized. You can get these plug-ins by installing the KNIME Labs extension package. Most of the plug-ins come already with a detailed description and some workflow examples
105 Java Snippet Fastest running scripting node in KNIME Syntax highlighting, auto completion, error checking Templates allow you to save scripts for later re-use Import custom libraries 4 105
106 Java Edit Variable Same as Java snippet, but without table ports 5 106
107 R Integration Run R inside KNIME. Works with existing R installations. Nodes for many tasks, but all with similiar interfaces First run: install.packages( Rserve ) and install.packages( Cairo )* *mac only 6 107
108 R Integration Syntax Highlighting Create and store templates Select R workspace Evaluate script R error messages Show Results 7 108
109 Python Integration Run Python inside KNIME Works with existing installations UI modeled after R integration 8 109
110 Python Scripting UI 9 110
111 RESTful Web Services In KNIME Labs: JSON Response: XML Response:
112 RESTful Web Services
113 KNIME Server as a REST resource
114 KNIME Server as a REST resource Use curl, SOAPUI or Chrome extension Postman to explore the REST API
115 Generic Web Service Client (SOAP) X resets all fields Input Values Analyze parses the wsdl file and automatically fills the details Output Values Select to include
116 External Tool Node
117 JSON and JSONPath Use the JSON Reader (or the GET Resource) nodes to get an JSON cell Use JSONPath nodes to query the JSON and extract certain parameters Editor window simplifies construction of JSON queries by auto-generating them (click on properties)
118 JSONPath
119 XML and XPath Use the XML Reader (or the GET Resource) nodes to get an XML cell Use XPath nodes to query the XML and extract certain parameters Editor window simplifies construction of XPath queries by auto-generating them (click on XML elements)
120 XPath
121 Hive, HDFS, Spark Architecture
122 Big Data Connector Extensions
123 Query Big Data Platform with Hive
124 Loading Data to Hive
125 Spark Functionality
126 Machine Learning in Spark
127 Exercises Use the REST nodes to call an external web service Choose either books.json or books.xml and use the appropriate tools to extract the book name, author and title Programmers: choose your favorite from Java, Python and R Take the KNIME table and multiply all values in one column by 10. Python/R: Build a simple decision tree learner Alternatively use the Generic JavaScript node and D3.js to create a custom JavaScript visualization that can be viewed in KNIME Analytics Platform, and KNIME WebPortal
128 Exporting Data & Deployment 1 128
129 Exporting Data After an analysis is completed, what next? Write results to a file Create/update a database Save the model for use elsewhere Generate a rich report Deploy via KNIME WebPortal Deploy via workflow as RESTful web service 2 129
130 Input/Output in Deployment Input File (CSV, Table, XLS, ) Database JSON for REST API Output Report (BIRT, Tableau, Spotfire) File (CSV, table, XLS, ) Dashboard on WebPortal 3 130
131 To Report / To BIRT Report Also available Nodes to Tableau and Spotfire 4 131
132 To File / Database 5 132
133 REST API on Server 6 133
134 To Dashboard on WebPortal Step 1 Upload File Step 2 Dashboard 7 134
135 Workflow on KNIME WebPortal Step 1 Upload File Step 2 Dashboard Available in KNIME Server 8 135
136 Wrapped Node to produce Dashboard on Web Page Interactive Table / Plots on web page Bar Chart HTML Text 9 136
137 Data Export Nodes Typically characterized by: Magenta color 1 input port, no output ports Create file on file system or write to database
138 New Node: Table Writer
139 New Node: XLS Writer
140 New Node: Database Writer Only if no Connector node
141 New Node: Database Update
142 Reporting in KNIME Reporting in KNIME is done via a 3 rd party application named BIRT (Business Intelligence Reporting Tool) Data is sent to BIRT from KNIME using special nodes. Reports in BIRT are constructed from report items, which may include images, tables, charts and labels. Reports may be generated in a variety of formats (html, pdf, pptx, xlsx, docx)
143 New Node: Data to Report Send a data table to BIRT
144 New Node: Image to Report Send an image to BIRT PNG and SVG are supported formats (see node description for details)
145 Edit the Report Open the workflow > Click the Report Editor button in the tool bar Licensed under a Creative Commons Attribution
146 Reporting Perspective Data from KNIME Views Report Layout Report Items
147 Charting in BIRT Many chart types Fine control of plot appearance Familiar Excel Like interface Supports interactivity
148 Exporting Data Exercise Starting with exercise: Exporting Data Write your predictions to disk as a KNIME table Create a heatmap of the normalized confusion matrix and send it to BIRT Send your model accuracy table to BIRT Define a very simple report showing the accuracy of your model Generate a PDF of your report
149 Flow Variables 1 149
150 Goal of this Session What is a Flow Variable? Create a Flow Variable Use a Flow Variable as a parameter in the node settings Use a quickform node to parameterise a Wrapped Metanode 2 150
151 Flow variables: Usage example Each month I am asked to produce a sales report for the most popular product Filter only rows With Products = Gold Investment 3 151
152 Flow variables: Usage example Each month I need to launch the Analytics Platform, run the workflow up to the GroupBy node, then get the most popular product and update the Row Filter. Or do I? Perhaps flow variables can help 4 152
153 Automatically filter by most popular product Count products, and put most important at the top of the list Create a variable containing the most popular product Pass the flow variable to the row filter 5 153
154 Flow Variable Ports 6 154
155 Apply a Flow Variable (button) The Flow Variable button 7 155
156 Apply a Flow Variable (advanced) The Flow Variable tab List of available Flow Variables 8 156
157 Create a Flow Variable (button) Name of the new Flow Variable 9 157
158 Create a Flow Variable (advanced) Converting a setting value into a Flow Variable Name of the new variable
159 Summary: Flow Variables Flow Variables are workflow parameters used to overwrite existing node settings. A Flow Variable is carried along workflow branches (parallel branches don t share local flow variables). Flow Variables can be of type String, Integer, or Double. Flow Variables can be created in the Flow Variables tab of any node, using the Table Row to Variable Node or using QuickForms
160 New Node: Sorter Sorts a table! Choice of ascending or descending Sort by multiple columns
161 Flow Variables Exercise: Activity I Starting with exercise: Flow Variables Create a workflow that filters the products to contain only rows for the Gold Investment product. Find the most common product type and automatically filter the table according to that value
162 Quickform Nodes for Variable Creation and Output
163 Quickform Node Configuration Use Quickforms to create Flow Variables Default value is selected (here P+B Investment)
164 Wrapped Metanodes Similar to Metanodes Differ in key areas: Limited variable scope (c.f. global scope for Metanodes) Use with Quick Form nodes (Analytics Platform 3.1+) Key to advanced functionality in KNIME products: Use for new WebPortal pages
165 Metanodes vs. Wrapped Metanodes Metanodes Quick Forms Legacy Standard Variable scope Global Local Wrapped Metanodes* WebPortal Execution Old New (work with loops/switches) JavaScript views in WebPortal Not supported Supported WebPortal Usage Quickforms used globally Views/Quickforms must be embedded in a Wrapped Metanode Recommended uses Legacy workflows New developments Compatibility KNIME Server 3.x/4.x KNIME Server 4.2+ * Valid for KNIME Analytics Platform 3.1 and above
166 Create a Wrapped Metanode Select nodes to wrap Right-click a node Choose Encapsulate into Wrapped Metanode
167 Simple configuration of Wrapped Metanode Right-click a Wrapped Metanode to configure Use in WebPortal
168 Configure Wrapped Metanode Ports Add ports to metanodes or remove them to adapt to changes after creation of metanode
169 Passing Variables from Wrapped Metanodes Flow variables by default only available locally within wrapped metanode If needed, they can be passed to context outside metanode
170 Workflows with Reports in the KNIME WebPortal Step 1 Select workflow to run, configure notification, and report formats Step 2 Run settings and set of workflow parameters Last Step See results and report Optionally: Embed Interactive JavaScript visualizations Available in KNIME Server
171 Flow Variables Exercise: Activity II Starting with exercise: Flow Variables Create a Wrapped Metanode that allows users to filter records by choosing a value from the Products column. 171
172 Date/Time Data 1 172
173 Date & Time Overview Dedicated data type for date and time data Supported in Date&Time nodes (and others: GroupBy, Pivot, Line Plot) Complete re-write in KNIME
174 New Node: String to Date&Time Convert date/time data from string into a native Date&time cell Automaticall guesses correct format for many different types of date formatting Enter format manually if autoguessing didn t work KNIME automatically adds custom formats to auto-guess list Convert multiple columns of same date format in one node Select columns to transform Enter date format manually Select type of output column Click to auto-guess format 3 174
175 Date&Time Data Types Date Time Date & Time Date & Time + Time zone 175
176 New Node: Date&Time-based Row Filter Filter rows from a specified time period Range can be limited on upper bound, lower bound or both Options for end point: Date&Time: Fixed data and time Duration: Duration string (e.g. 2y 3M) Numerical: Select granularity from dropdown and enter number 5 176
177 New Node: String to Duration Takes a string and converts it to a duration cell Three different options to format input strings Example: Convert 1 year, 2 months, 3 weeks, and 4 days to duration cell ISO-8601: P1Y2M3W4D Short letter: 1y 2M 3w 4d Long word: 1 year 2 months 3 weeks 4 days 6 177
178 Duration-based Filtering Date&Time-based Row Filter allows to extract time periods From the start date, select all rows within the defined period Hours, days, weeks, months, etc. 1 Month 7 178
179 New Node: Date&Time Difference Choose desired resolution (days, hours, minutes, etc.) Check the difference between a time column and Another time column Execution time User defined time Time from previous row To calculate difference to second column, both columns need to have the same type! 8 179
180 New Node: Date&Time Shift Shifts date or time by either a Duration or a numerical value Use Duration: Use duration column Or shift by user-defined value E.g. 1y, 2M, 5h, etc. Use Numerical in combination with user-defined granularity Use numerical column Or shift by user-defined value Select granularity 180
181 New Nodes: Modify Time / Modify Date Modify Date&Time columns Three options: Append time (date) to to date (time) column Change time (date) to fixed value Remove time (date) from Date&Time column Column selection shows only columns suitable for currently selected option 181
182 Modify Time / Modify Date: Options Modify Time: Available options Modify Date: Available options Input: Date Input: Date Append time N/A Input: Time Input: Time N/A Append date Input: Date&Time Input: Date&Time Change time; Remove time Change date; Remove date Input: Date&Time (Time zone) Input: Date&Time (Time zone) Change time; Remove time Change date; Remove date 182
183 New Node: Modify Time Zone Similar to Modify Time/Modify Date Input: Date&Time Set time zone Input: Date&Time (Time zone) Set time zone Shift time zone Remove time zone Select time zone from dropdown list 183
184 Date and Time Analysis Exercise, Activity I Starting with exercise: Date and Time Analysis, Activity I Read the file: meter_data.csv Use String Manipulation to construct a date/timestamp column from the individual date and time columns Convert the date/time stamp column to a KNIME timestamp with the String to Date&Time node Extract entries from January 2007 using the Date&Time-based Row Filter
185 New Node: Extract Date&Time Fields Extract date fields (year, day, month) or time fields (hour, minute, second) from a date&time cell. Pick and choose which fields to include Useful when used in combination with data aggregation nodes (groupby, pivot etc.)
186 New Node: Moving Average Effectively a smoothing node Smoothing defined by a window type (centered, forward or backward) and weighted or not Useful when plotting aggregated time-series data to more easily see trends
187 New Node: Moving Aggregation Blend of GroupBy + Moving Average Functionality Group by moving window Aggregate using standard KNIME methods
188 New Node: JavaScript Line Plot Line plot with support for Date columns
189 Date and Time Analysis Exercise, Activity II Starting with exercise: Date and Time Analysis, Activity II Read the file: sampled_meter_data.table Use Extract Date&Time Fields to pull out the Year and Day of Year and Hour values from the timestamp Use GroupBy to aggregate the data by year, day and hour. Group by these values and calculate mean time and intensity Calculate a Gaussian centered moving average for the Intensity column and plot this data in a line plot Calculate the Maximum of the intensity column for the preceding day (1440 minutes) Optional: Plot the raw intensity, moving average, and moving maximum in a Line Chart
190 Workflow Control Loops, switches, try-catch 1 190
191 Workflow Control Structures Loops Iterate over a workflow snippet with variable inputs. Switches Direct the path of a workflow by selectively executing one or more workflow branches. Try-Catch Handle workflow branches that may fail in execution and you don t know before execution 2 191
192 The Loop Block A loop block is defined by appropriate loop start and loop end nodes. Loop body = Nodes in between and side branches. Loop body Loop start node Loop end node 3 192
193 New Node: Group Loop Start Similar to GroupBy except without aggregation tab. Each iteration of the loop passes the next group of rows. You implement the aggregation task. It can be anything from a complex calculation to updating a database
194 New Node: Create File Name Inputs: Directory, base file name, file Extension Output: Flow variable -> use as input for e.g. writer node 5 194
195 Example: Writing aggregated files Group Loop Start Variable Loop End Group data by specific column values Iterate over all groups of data Create an appropriate file name Write grouped data to tables with new file name 6 195
196 Workflow Control Exercise, Activity I Starting with exercise: Workflow Control, Activity I Read the file: CurrentDetailData.table Group over all of the values in the Products column For each group of data, write a new KNIME table to your KNIME Workspace. Give it an appropriate filename. (Hint: Group Loop Start creates a flow variable naming the current group) 7 196
197 New Node: List Files List all files in a directory Restrict to: top level directory (i.e. not recursive), specific file extensions matching name patterns (regex or wildcard) Provides file references as a table of URLs and absolute paths 8 197
198 New Node: TableRow to Variable Loop Start Similar to TableRow to Variable node. Each iteration of the loop converts the next row of the input table into flow variables. Inject variables into other nodes (often file readers) to reexecute subflows with a progression of settings 9 198
199 Example: Reading Many Files List all files in a directory Convert each file to a flow variable (1 per iteration) In each iteration, read the file, collect the results
200 Workflow Control Exercise, Activity II Starting with exercise: Workflow Control, Activity II Use List files to find the file names of the files created in exercise 2 Iterate over that list of files, and read them into a single KNIME Table
201 Switches A switch allows you to selectively activate branches of a workflow Inactive branches are marked with a red x on their output ports. Inactive nodes propagate down stream. Inactive Active
202 New Node: Rule Engine/Rule Engine Variable Define custom logic to using simple rules. Rules like: <Antecedent> => <Consequence> (1=1 => true ) May be used in flow variables or tables Easiest way to encode logic for switches
203 New Node: If Switch Control which branches of your workflow are active programatically. Controlled with a flow variable, setting the value to the literal strings: top, bottom, both May be used in flow variables or tables (different nodes)
204 Workflow Control Exercise, Activity III Starting with exercise: Workflow Control, Activity III Read the data file: CurrentDetailData.table Use a Single Input Quickform to let a user choose the values scatter or bar Use a Rule Engine Variable node to convert that into a format readable by an IF Switch ( top or bottom ) Use an IF Switch to create either a scatter plot or a bar plot depending on the input
205 Try-catch A way to catch errors in workflows. Useful when it is hard to know if a node will execute (for example, when connecting to a web service). KNIME tries to execute the nodes, but if it fails will fall back to an alternate branch
206 Streaming Standard execution: Node by node. Node processes all data, finishes, then passes data to next node, etc. Streaming: Nodes executed concurrently, each nodes passes data to the next as soon as it is available, i.e. before node is fully executed Faster execution, esp. for reading/preprocessing data Create wrapped metanode -> Configure -> Job Manager Selection -> Simple Streaming Not available for all nodes (show in node repository) Can only execute entire metanode, not individual nodes Intermediate results not available since nothing is cached
207 Streaming
208 Advanced Data Mining Random Forest, Tree Ensembles, Parameter Optimization, Cross Validation 1 208
209 New Nodes: Random Forest Ensemble learning method for classification and regression tasks It consists of a chosen number of decision trees Each of the decision tree models is learned on a different set of rows (records) and a different set of columns (describing attributes) 2 209
210 KNIME s Tree Ensemble models The general idea is to take advantage of the wisdom of the crowd : combining predictions from a large number of weak predictors leads to a more accurate predictor. This is called bagging. X P1 P2 Pn y Typically: for classification the individual models vote and the majority wins; for regression, the individual predictions are averaged 3 210
211 How does bagging work? Pick a different random subset of the training data for each model in the ensemble (bag). Build tree Build tree Build tree
212 An extra benefit of bagging: out of bag estimation Allows testing the model using the training data: when validating, each model should only vote on data points that were not used to train it X 1 X P1 P2 Pn P1 P2 Pn y 1 OOB y 2 OOB 5 212
213 Random Forests Bags of decision trees, but an extra element of randomization is applied when building the trees: each node in the decision tree only sees a subset of the input columns, typically N. Random forests tend to be very robust w.r.t. overfitting (though the individual trees are almost certainly overfit) Extra benefit: training tends to be much faster Build tree
214 New Nodes: Random Forest The output model describes a random forest and is applied in the corresponding predictor node using a simple majority vote. Tree Ensemble Learner node provides more functionalities 7 214
215 Gradient Boosting Another algorithm for creating ensembles of decision trees Starts with a tree built on a subset of the data Builds additional trees to fit the residual errors Typically uses fairly shallow trees Can introduce randomness in choice of data subsets ( stochastic gradient boosting ) and in variable choice
216 Out-of-bag vs. leave out model scoring Results from predicting quality of white wine 9 216
217 Tree Ensembles Random Forest variant More options to set Trees may be trained using subsets of rows and/or columns and this approach may lead to greater accuracy
218 Tree Ensembles Optimization of a tree ensemble is complex due to a surplus of configuration options (number of models, number of columns, number of rows, tree depth etc.)
219 New Nodes: Tree Ensemble Learner/Predictor Choose which columns to include Configure a prototype tree (depth, split criteria etc.) Setup ensemble parameters (model count, row/column subsampling)
220 Advanced Data Mining Exercise, Activity I Starting with exercise: Advanced Data Mining, Activity I Read the data file CurrentDetailData.table Partition the data 50/50 using stratified sampling on the Products column Create a Tree Ensemble model to predict the Products column Use a tree depth of 5, 50 models, and 75% of rows and columns for each iteration. Calculate the overall accuracy of the model and filter that value into a flow variable
221 Parameter Optimization Some modeling approaches are very sensitive to their configuration. Calculating optimum settings is not always possible. Parameter Optimization loops may help find a good configuration
222 New Node: Parameter Optimization Loop Start Define some parameters to optimize Set upper/lower bounds and step sizes (and flag integers) Choose an optimization method Brute force for maximum accuracy but slower computation Hill climbing for better faster runtimes but may get stuck in local optimum settings
223 New Node: Parameter Optimization Loop End Collects some value to optimize in a flow variable. Value may be maximized (accuracy) or minimized (error)
224 Advanced Data Mining Exercise, Activity II Starting with exercise: Advanced Data Mining, Activity II Add a parameter optimization loop to your Tree Ensemble Model Use Hill climbing to determine the optimum number of models (min=10,max=200,step=10, int = yes) Maximize the accuracy in the loop and node. What were the optimal settings? (Hint: don t forget to use the flow variable in your learner)
225 Cross Validation Used to evaluate model stability. Re-execute the modeling process many times using different data partitions. Collect aggregated statistics on model accuracy
226 Example: Cross Validation X-Partitioner X-Aggregator X-Partitioner replaces Partition X-Aggregator replaces Scorer Can be used with any learner/predictor
227 Advanced Data Mining Exercise, Activity III Starting with exercise: Advanced Data Mining, Activity III Create a 10-fold cross validation for your Tree Ensemble Learner. Calculate the mean error for the cross validation. Does the model seem stable?
228 Model Selection Compare and deploy 1 228
229 Model Selection We can build many models, but which should we use? Probably the best one Most accurate?, Best ROC?, Other criteria? In this section we will: Bring our models and performance indicators into a single table Sort that table to choose the best model Extract the best model and use it to make predictions 2 229
230 New Nodes: PMML to Cell/Cell to PMML Insert PMML models to or extract them from a KNIME table cell
231 New Node: Constant Value Column Insert a constant value as a new column to the table 4 231
232 New Node: PMML Predictor Score a table using a standard PMML model No configuration required 5 232
233 Model Selection Exercise Starting with exercise: Model Selection Train a decision tree on all the data. Partition and score the data as in the logistic regression example Join this model to the accuracy from the scorer. Label your model and concatenate this and other models together. Convert the best scoring model back to a model port. Predict new results for CurrentDetailData.table 6 233
234 Supplemental Workflows 1 234
235 Webportal Enabled Data Mining Builds and scores model on uploaded data Quickform nodes for stepwise configuration of workflow 1. Upload data file 2. Select target and features 3. Specify training / test fraction 4. Create downloadable PMML file 5. Report generation 2 235
236 WebPortal Enabled Data Mining 3 236
237 Webportal Enabled Data Mining Execution on Server via WebPortal
238 Webportal Enabled Data Mining Execution on Server via WebPortal
239 Webportal Enabled Data Mining Execution on Server via WebPortal
240 Webportal Enabled Data Mining Execution on Server via WebPortal
241 Webportal Enabled Data Mining Execution on Server via WebPortal
242 Webportal Enabled Data Mining Execution on Server via WebPortal
243 RESTful Geolocation Try Catch Block REST call to get lat/long for IPs
244 RESTful Geolocation Translates IPs to geo coordinates via RESTful service GET Resource: access RESTful API via GET IP to geo coordinates (lat/lon) Read REST Representation: parse REST result JSON, XML, CSV, Try Catch nodes to log errors gracefully
245 Geographic Analysis
246 Geographic Analysis Reads IPs from download weblog and related geo-coordinates Aggregates downloads by city, country, and US states OSM Map View to visualize geo-coordinates OSM Map to Image to create image of map view
247 Text and Network Mining KNIME Forums Create forum user interaction network Extract important terms, topics from forum sections
248 Text and Network Mining KNIME Forums
249 Text and Network Mining KNIME Forums Crawls KNIME forums and creates report Text Mining: tagging of users, topics, nodes Text Mining: tag cloud creation Network Mining: creation of user interaction networks Network Mining: visualization of user networks
250 Open Session
251 The End
KNIME Analytics Platform Course for Beginners
KNIME Analytics Platform Course for Beginners KNIME AG Overview KNIME Analytics Platform 1 2 What is KNIME Analytics Platform? A tool for data analysis, manipulation, visualization, and reporting Based
More informationInstallation KNIME AG. All rights reserved. 1
Installation 1. Install KNIME Analytics Platform (from thumb drive) 2. Help > Install New Software > Add (> Archive): 00_InstallationFiles/CommunityContributions_trunk.zip https://update.knime.org/community-contributions/trunk
More informationKNIME Big Data Training
KNIME Big Data Training education@knime.com Overview KNIME Analytics Platform 1 2 What is KNIME Analytics Platform? A tool for data analysis, manipulation, visualization, and reporting Based on the graphical
More informationText Mining Course for KNIME Analytics Platform
Text Mining Course for KNIME Analytics Platform KNIME AG Table of Contents 1. The Open Analytics Platform 2. The Text Processing Extension 3. Importing Text 4. Enrichment 5. Preprocessing 6. Transformation
More informationKNIME for the life sciences Cambridge Meetup
KNIME for the life sciences Cambridge Meetup Greg Landrum, Ph.D. KNIME.com AG 12 July 2016 What is KNIME? A bit of motivation: tool blending, data blending, documentation, automation, reproducibility More
More informationCopyright 2018 by KNIME Press
2 Copyright 2018 by KNIME Press All rights reserved. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval
More informationWhat is KNIME? workflows nodes standard data mining, data analysis data manipulation
KNIME TUTORIAL What is KNIME? KNIME = Konstanz Information Miner Developed at University of Konstanz in Germany Desktop version available free of charge (Open Source) Modular platform for building and
More informationThe Top 10 New Features in KNIME 2.8. Rosaria Silipo KNIME.com AG, San Francisco
The Top 10 New Features in KNIME 2.8 Rosaria Silipo KNIME.com AG, San Francisco KNIME 2.8 KNIME 2.8 was out end of July 2013 Many New Features Documentation available at: http://tech.knime.org/whats-new-in-knime-28
More informationKNIME What s new?! Bernd Wiswedel KNIME.com AG, Zurich, Switzerland
KNIME What s new?! Bernd Wiswedel KNIME.com AG, Zurich, Switzerland Data Access ASCII (File/CSV Reader, ) Excel Web Services Remote Files (http, ftp, ) Other domain standards (e.g. Sdf) Databases Data
More informationKNIME TUTORIAL. Anna Monreale KDD-Lab, University of Pisa
KNIME TUTORIAL Anna Monreale KDD-Lab, University of Pisa Email: annam@di.unipi.it Outline Introduction on KNIME KNIME components Exercise: Data Understanding Exercise: Market Basket Analysis Exercise:
More informationIntellicus Enterprise Reporting and BI Platform
Designing Adhoc Reports Intellicus Enterprise Reporting and BI Platform Intellicus Technologies info@intellicus.com www.intellicus.com Designing Adhoc Reports i Copyright 2012 Intellicus Technologies This
More informationStyle Report Enterprise Edition
INTRODUCTION Style Report Enterprise Edition Welcome to Style Report Enterprise Edition! Style Report is a report design and interactive analysis package that allows you to explore, analyze, monitor, report,
More informationBusiness Insight Authoring
Business Insight Authoring Getting Started Guide ImageNow Version: 6.7.x Written by: Product Documentation, R&D Date: August 2016 2014 Perceptive Software. All rights reserved CaptureNow, ImageNow, Interact,
More information1 Topic. Image classification using Knime.
1 Topic Image classification using Knime. The aim of image mining is to extract valuable knowledge from image data. In the context of supervised image classification, we want to assign automatically a
More informationRosaria Silipo, Michael P. Mazanetz. The KNIME Cookbook Recipes for the Advanced User
Rosaria Silipo, Michael P. Mazanetz The KNIME Cookbook Recipes for the Advanced User 1 Copyright 2012 by KNIME Press All rights reserved. This publication is protected by copyright, and permission must
More informationData Science. Data Analyst. Data Scientist. Data Architect
Data Science Data Analyst Data Analysis in Excel Programming in R Introduction to Python/SQL/Tableau Data Visualization in R / Tableau Exploratory Data Analysis Data Scientist Inferential Statistics &
More informationBusiness Intelligence and Reporting Tools
Business Intelligence and Reporting Tools Release 1.0 Requirements Document Version 1.0 November 8, 2004 Contents Eclipse Business Intelligence and Reporting Tools Project Requirements...2 Project Overview...2
More informationAs a reference, please find a version of the Machine Learning Process described in the diagram below.
PREDICTION OVERVIEW In this experiment, two of the Project PEACH datasets will be used to predict the reaction of a user to atmospheric factors. This experiment represents the first iteration of the Machine
More informationSAS Visual Analytics 8.2: Getting Started with Reports
SAS Visual Analytics 8.2: Getting Started with Reports Introduction Reporting The SAS Visual Analytics tools give you everything you need to produce and distribute clear and compelling reports. SAS Visual
More informationSpecialist ICT Learning
Specialist ICT Learning APPLIED DATA SCIENCE AND BIG DATA ANALYTICS GTBD7 Course Description This intensive training course provides theoretical and technical aspects of Data Science and Business Analytics.
More informationTutorial on Machine Learning Tools
Tutorial on Machine Learning Tools Yanbing Xue Milos Hauskrecht Why do we need these tools? Widely deployed classical models No need to code from scratch Easy-to-use GUI Outline Matlab Apps Weka 3 UI TensorFlow
More informationEZY Intellect Pte. Ltd., #1 Changi North Street 1, Singapore
Tableau in Business Intelligence Duration: 6 Days Tableau Desktop Tableau Introduction Tableau Introduction. Overview of Tableau workbook, worksheets. Dimension & Measures Discrete and Continuous Install
More informationSitecore Experience Platform 8.0 Rev: September 13, Sitecore Experience Platform 8.0
Sitecore Experience Platform 8.0 Rev: September 13, 2018 Sitecore Experience Platform 8.0 All the official Sitecore documentation. Page 1 of 455 Experience Analytics glossary This topic contains a glossary
More informationACHIEVEMENTS FROM TRAINING
LEARN WELL TECHNOCRAFT DATA SCIENCE/ MACHINE LEARNING SYLLABUS 8TH YEAR OF ACCOMPLISHMENTS AUTHORIZED GLOBAL CERTIFICATION CENTER FOR MICROSOFT, ORACLE, IBM, AWS AND MANY MORE. 8411002339/7709292162 WWW.DW-LEARNWELL.COM
More informationACTIVE Net Insights user guide. (v5.4)
ACTIVE Net Insights user guide (v5.4) Version Date 5.4 January 23, 2018 5.3 November 28, 2017 5.2 October 24, 2017 5.1 September 26, 2017 ACTIVE Network, LLC 2017 Active Network, LLC, and/or its affiliates
More informationTableau Training Content
TABLEAU DESKTOP INTRODUCTION AND GETTING STARTED Tableau desktop role in the tableau product line Application terminology View terminology Data terminology Visual cues for fields BEST PRACTICES IN CONNECTING
More informationGOBENCH IQ Release v
GOBENCH IQ Release v1.2.3.3 2018-06-11 New Add-Ons / Features / Enhancements in GOBENCH IQ v1.2.3.3 GOBENCH IQ v1.2.3.3 contains several new features and enhancements ** New version of the comparison Excel
More informationSwitching to Sheets from Microsoft Excel Learning Center gsuite.google.com/learning-center
Switching to Sheets from Microsoft Excel 2010 Learning Center gsuite.google.com/learning-center Welcome to Sheets Now that you've switched from Microsoft Excel to G Suite, learn how to use Google Sheets
More informationRelease notes SPSS Statistics 20.0 FP1 Abstract Number Description
Release notes SPSS Statistics 20.0 FP1 Abstract This is a comprehensive list of defect corrections for the SPSS Statistics 20.0 Fix Pack 1. Details of the fixes are listed below under the tab for the respective
More informationIntroduction to Data Science. Introduction to Data Science with Python. Python Basics: Basic Syntax, Data Structures. Python Concepts (Core)
Introduction to Data Science What is Analytics and Data Science? Overview of Data Science and Analytics Why Analytics is is becoming popular now? Application of Analytics in business Analytics Vs Data
More informationTutorial Case studies
1 Topic Wrapper for feature subset selection Continuation. This tutorial is the continuation of the preceding one about the wrapper feature selection in the supervised learning context (http://data-mining-tutorials.blogspot.com/2010/03/wrapper-forfeature-selection.html).
More informationSAS Web Report Studio 3.1
SAS Web Report Studio 3.1 User s Guide SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2006. SAS Web Report Studio 3.1: User s Guide. Cary, NC: SAS
More informationPrototyping Data Intensive Apps: TrendingTopics.org
Prototyping Data Intensive Apps: TrendingTopics.org Pete Skomoroch Research Scientist at LinkedIn Consultant at Data Wrangling @peteskomoroch 09/29/09 1 Talk Outline TrendingTopics Overview Wikipedia Page
More informationHow to choose the right approach to analytics and reporting
SOLUTION OVERVIEW How to choose the right approach to analytics and reporting A comprehensive comparison of the open source and commercial versions of the OpenText Analytics Suite In today s digital world,
More informationDatameer for Data Preparation:
Datameer for Data Preparation: Explore, Profile, Blend, Cleanse, Enrich, Share, Operationalize DATAMEER FOR DATA PREPARATION: EXPLORE, PROFILE, BLEND, CLEANSE, ENRICH, SHARE, OPERATIONALIZE Datameer Datameer
More informationIntellicus Enterprise Reporting and BI Platform
Working with Query Objects Intellicus Enterprise Reporting and BI Platform ` Intellicus Technologies info@intellicus.com www.intellicus.com Working with Query Objects i Copyright 2012 Intellicus Technologies
More informationData Explorer: User Guide 1. Data Explorer User Guide
Data Explorer: User Guide 1 Data Explorer User Guide Data Explorer: User Guide 2 Contents About this User Guide.. 4 System Requirements. 4 Browser Requirements... 4 Important Terminology.. 5 Getting Started
More informationSAS (Statistical Analysis Software/System)
SAS (Statistical Analysis Software/System) SAS Adv. Analytics or Predictive Modelling:- Class Room: Training Fee & Duration : 30K & 3 Months Online Training Fee & Duration : 33K & 3 Months Learning SAS:
More informationDATA 301 Introduction to Data Analytics Visualization. Dr. Ramon Lawrence University of British Columbia Okanagan
DATA 301 Introduction to Data Analytics Visualization Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca DATA 301: Data Analytics (2) Why learn Visualization? Visualization
More informationEnterprise Miner Tutorial Notes 2 1
Enterprise Miner Tutorial Notes 2 1 ECT7110 E-Commerce Data Mining Techniques Tutorial 2 How to Join Table in Enterprise Miner e.g. we need to join the following two tables: Join1 Join 2 ID Name Gender
More informationZENworks Reporting System Reference. January 2017
ZENworks Reporting System Reference January 2017 Legal Notices For information about legal notices, trademarks, disclaimers, warranties, export and other use restrictions, U.S. Government rights, patent
More informationData Should Not be a Four Letter Word Microsoft Excel QUICK TOUR
Toolbar Tour AutoSum + more functions Chart Wizard Currency, Percent, Comma Style Increase-Decrease Decimal Name Box Chart Wizard QUICK TOUR Name Box AutoSum Numeric Style Chart Wizard Formula Bar Active
More informationUser Guide. Kronodoc Kronodoc Oy. Intelligent methods for process improvement and project execution
User Guide Kronodoc 3.0 Intelligent methods for process improvement and project execution 2003 Kronodoc Oy 2 Table of Contents 1 User Guide 5 2 Information Structure in Kronodoc 6 3 Entering and Exiting
More informationIntroduction to BEST Viewpoints
Introduction to BEST Viewpoints This is not all but just one of the documentation files included in BEST Viewpoints. Introduction BEST Viewpoints is a user friendly data manipulation and analysis application
More informationWhat s New In Sawmill 8 Why Should I Upgrade To Sawmill 8?
What s New In Sawmill 8 Why Should I Upgrade To Sawmill 8? Sawmill 8 is a major new version of Sawmill, the result of several years of development. Nearly every aspect of Sawmill has been enhanced, and
More informationBEAWebLogic. Portal. Overview
BEAWebLogic Portal Overview Version 10.2 Revised: February 2008 Contents About the BEA WebLogic Portal Documentation Introduction to WebLogic Portal Portal Concepts.........................................................2-2
More informationData Science Essentials
Data Science Essentials Lab 6 Introduction to Machine Learning Overview In this lab, you will use Azure Machine Learning to train, evaluate, and publish a classification model, a regression model, and
More informationalteryx training courses
alteryx training courses alteryx designer 2 day course This course covers Alteryx Designer for new and intermediate Alteryx users. It introduces the User Interface and works through core Alteryx capability,
More informationSAS Visual Analytics 8.2: Working with Report Content
SAS Visual Analytics 8.2: Working with Report Content About Objects After selecting your data source and data items, add one or more objects to display the results. SAS Visual Analytics provides objects
More informationDesigning Ad hoc Report. Version: 7.3
Designing Ad hoc Report Version: 7.3 Copyright 2015 Intellicus Technologies This document and its content is copyrighted material of Intellicus Technologies. The content may not be copied or derived from,
More informationImagine. Create. Discover. User Manual. TopLine Results Corporation
Imagine. Create. Discover. User Manual TopLine Results Corporation 2008-2009 Created: Tuesday, March 17, 2009 Table of Contents 1 Welcome 1 Features 2 2 Installation 4 System Requirements 5 Obtaining Installation
More informationApplied Machine Learning
Applied Machine Learning Lab 3 Working with Text Data Overview In this lab, you will use R or Python to work with text data. Specifically, you will use code to clean text, remove stop words, and apply
More informationExport out report results in multiple formats like PDF, Excel, Print, , etc.
Edition Comparison DOCSVAULT Docsvault is full of features that can help small businesses and large enterprises go paperless. The feature matrix below displays Docsvault s abilities for its Enterprise
More informationEdition 3.2. Tripolis Solutions Dialogue Manual version 3.2 2
Edition 3.2 Tripolis Solutions Dialogue Manual version 3.2 2 Table of Content DIALOGUE SETUP... 7 Introduction... 8 Process flow... 9 USER SETTINGS... 10 Language, Name and Email address settings... 10
More informationInstructor : Dr. Sunnie Chung. Independent Study Spring Pentaho. 1 P a g e
ABSTRACT Pentaho Business Analytics from different data source, Analytics from csv/sql,create Star Schema Fact & Dimension Tables, kettle transformation for big data integration, MongoDB kettle Transformation,
More informationUser Guide. Copyright Wordfast, LLC All rights reserved.
User Guide All rights reserved. Table of Contents About this Guide... 7 Conventions...7 Typographical... 7 Icons... 7 1 Release Notes Summary... 8 New Features and Improvements... 8 Fixed Issues... 8 2
More informationSpotfire: Brisbane Breakfast & Learn. Thursday, 9 November 2017
Spotfire: Brisbane Breakfast & Learn Thursday, 9 November 2017 CONFIDENTIALITY The following information is confidential information of TIBCO Software Inc. Use, duplication, transmission, or republication
More informationTalend Data Preparation Free Desktop. Getting Started Guide V2.1
Talend Data Free Desktop Getting Guide V2.1 1 Talend Data Training Getting Guide To navigate to a specific location within this guide, click one of the boxes below. Overview of Data Access Data And Getting
More informationMicroStrategy Desktop
MicroStrategy Desktop Quick Start Guide MicroStrategy Desktop is designed to enable business professionals like you to explore data, simply and without needing direct support from IT. 1 Import data from
More informationQuick Start Guide. Version R94. English
Custom Reports Quick Start Guide Version R94 English December 12, 2016 Copyright Agreement The purchase and use of all Software and Services is subject to the Agreement as defined in Kaseya s Click-Accept
More informationLogi Info v12.5 WHAT S NEW
Logi Info v12.5 WHAT S NEW Introduction Logi empowers companies to embed analytics into the fabric of their organizations and products enabling anyone to analyze data, share insights, and make informed
More informationIntegration Service. Admin Console User Guide. On-Premises
Kony MobileFabric TM Integration Service Admin Console User Guide On-Premises Release 7.3 Document Relevance and Accuracy This document is considered relevant to the Release stated on this title page and
More information1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda
Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:
More informationDesigning Ad hoc Reports. Version: 16.0
Designing Ad hoc Reports Version: 16.0 Copyright 2017 Intellicus Technologies This document and its content is copyrighted material of Intellicus Technologies. The content may not be copied or derived
More informationWeb Analysis in 4 Easy Steps. Rosaria Silipo, Bernd Wiswedel and Tobias Kötter
Web Analysis in 4 Easy Steps Rosaria Silipo, Bernd Wiswedel and Tobias Kötter KNIME Forum Analysis KNIME Forum Analysis Steps: 1. Get data into KNIME 2. Extract simple statistics (how many posts, response
More informationThe following topics describe how to work with reports in the Firepower System:
The following topics describe how to work with reports in the Firepower System: Introduction to Reports Introduction to Reports, on page 1 Risk Reports, on page 1 Standard Reports, on page 2 About Working
More informationKNIME Enalos+ Molecular Descriptor nodes
KNIME Enalos+ Molecular Descriptor nodes A Brief Tutorial Novamechanics Ltd Contact: info@novamechanics.com Version 1, June 2017 Table of Contents Introduction... 1 Step 1-Workbench overview... 1 Step
More informationMicroStrategy Academic Program
MicroStrategy Academic Program Creating a center of excellence for enterprise analytics and mobility. DATA PREPARATION: HOW TO WRANGLE, ENRICH, AND PROFILE DATA APPROXIMATE TIME NEEDED: 1 HOUR TABLE OF
More informationFrom Insight to Action: Analytics from Both Sides of the Brain. Vaz Balasingham Director of Solutions Consulting
From Insight to Action: Analytics from Both Sides of the Brain Vaz Balasingham Director of Solutions Consulting vbalasin@tibco.com Insight to Action from Both Sides of the Brain Value Grow Revenue Reduce
More informationPython Certification Training
Introduction To Python Python Certification Training Goal : Give brief idea of what Python is and touch on basics. Define Python Know why Python is popular Setup Python environment Discuss flow control
More informationSpotfire Data Science with Hadoop Using Spotfire Data Science to Operationalize Data Science in the Age of Big Data
Spotfire Data Science with Hadoop Using Spotfire Data Science to Operationalize Data Science in the Age of Big Data THE RISE OF BIG DATA BIG DATA: A REVOLUTION IN ACCESS Large-scale data sets are nothing
More informationADVANCED ANALYTICS USING SAS ENTERPRISE MINER RENS FEENSTRA
INSIGHTS@SAS: ADVANCED ANALYTICS USING SAS ENTERPRISE MINER RENS FEENSTRA AGENDA 09.00 09.15 Intro 09.15 10.30 Analytics using SAS Enterprise Guide Ellen Lokollo 10.45 12.00 Advanced Analytics using SAS
More informationSmartView. User Guide - Analysis. Version 2.0
SmartView User Guide - Analysis Version 2.0 Table of Contents Page i Table of Contents Table Of Contents I Introduction 1 Dashboard Layouts 2 Dashboard Mode 2 Story Mode 3 Dashboard Controls 4 Dashboards
More informationUser Guide. Copyright Wordfast, LLC All rights reserved.
User Guide All rights reserved. Table of Contents Summary... 7 New Features and Improvements... 7 Fixed Issues... 8 1 About Pro... 9 Key Advantages... 9 2 Get Started... 10 Requirements... 10 Install and
More informationEnd-to-End data mining feature integration, transformation and selection with Datameer Datameer, Inc. All rights reserved.
End-to-End data mining feature integration, transformation and selection with Datameer Fastest time to Insights Rapid Data Integration Zero coding data integration Wizard-led data integration & No ETL
More informationHarePoint Analytics. For SharePoint. User Manual
HarePoint Analytics For SharePoint User Manual HarePoint Analytics for SharePoint 2013 product version: 15.5 HarePoint Analytics for SharePoint 2016 product version: 16.0 04/27/2017 2 Introduction HarePoint.Com
More informationIntermediate Microsoft Excel 2010
P a g e 1 Intermediate Microsoft Excel 2010 ABOUT THIS CLASS This class is designed to continue where the Microsoft Excel 2010 Basics class left off. Specifically, we will cover additional ways to organize
More informationDATA SCIENCE INTRODUCTION QSHORE TECHNOLOGIES. About the Course:
DATA SCIENCE About the Course: In this course you will get an introduction to the main tools and ideas which are required for Data Scientist/Business Analyst/Data Analyst/Analytics Manager/Actuarial Scientist/Business
More informationExcel Shortcuts Increasing YOUR Productivity
Excel Shortcuts Increasing YOUR Productivity CompuHELP Division of Tommy Harrington Enterprises, Inc. tommy@tommyharrington.com https://www.facebook.com/tommyharringtonextremeexcel Excel Shortcuts Increasing
More informationSAS Data Explorer 2.1: User s Guide
SAS Data Explorer 2.1: User s Guide Working with SAS Data Explorer Understanding SAS Data Explorer SAS Data Explorer and the Choose Data Window SAS Data Explorer enables you to copy data to memory on SAS
More informationExpedient User Manual Getting Started
Volume 1 Expedient User Manual Getting Started Gavin Millman & Associates Pty Ltd 281 Buckley Street Essendon VIC 3040 Phone 03 9331 3944 Web www.expedientsoftware.com.au Table of Contents Logging In...
More informationRelease notes for version 3.7.2
Release notes for version 3.7.2 Important! Create a backup copy of your projects before updating to the new version. Projects saved in the new version can t be opened in versions earlier than 3.7. Breaking
More informationWINDEV 23 - WEBDEV 23 - WINDEV Mobile 23 Documentation version
WINDEV 23 - WEBDEV 23 - WINDEV Mobile 23 Documentation version 23-1 - 04-18 Summary Part 1 - Report editor 1. Introduction... 13 2. How to create a report... 23 3. Data sources of a report... 43 4. Describing
More informationLavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs
1.1 Introduction Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs For brevity, the Lavastorm Analytics Library (LAL) Predictive and Statistical Analytics Node Pack will be
More informationEnterprise Miner Version 4.0. Changes and Enhancements
Enterprise Miner Version 4.0 Changes and Enhancements Table of Contents General Information.................................................................. 1 Upgrading Previous Version Enterprise Miner
More informationTeiid Designer User Guide 7.5.0
Teiid Designer User Guide 1 7.5.0 1. Introduction... 1 1.1. What is Teiid Designer?... 1 1.2. Why Use Teiid Designer?... 2 1.3. Metadata Overview... 2 1.3.1. What is Metadata... 2 1.3.2. Editing Metadata
More informationUser Guide. Copyright Wordfast, LLC All rights reserved.
User Guide All rights reserved. Table of Contents About this Guide... 7 Conventions...7 Typographical... 7 Icons... 7 1 Release Notes Summary... 8 New Features and Improvements... 8 Fixed Issues... 8 Known
More informationAbstract. For notes detailing the changes in each release, see the MySQL for Excel Release Notes. For legal information, see the Legal Notices.
MySQL for Excel Abstract This is the MySQL for Excel Reference Manual. It documents MySQL for Excel 1.3 through 1.3.7. Much of the documentation also applies to the previous 1.2 series. For notes detailing
More informationSpreadsheet definition: Starting a New Excel Worksheet: Navigating Through an Excel Worksheet
Copyright 1 99 Spreadsheet definition: A spreadsheet stores and manipulates data that lends itself to being stored in a table type format (e.g. Accounts, Science Experiments, Mathematical Trends, Statistics,
More informationRelease notes for version 3.9.2
Release notes for version 3.9.2 What s new Overview Here is what we were focused on while developing version 3.9.2, and a few announcements: Continuing improving ETL capabilities of EasyMorph by adding
More informationIntroduction to Data Science
Introduction to Data Science Lab 4 Introduction to Machine Learning Overview In the previous labs, you explored a dataset containing details of lemonade sales. In this lab, you will use machine learning
More informationSPSS Statistics Patch Description
SPSS Statistics 18.0.1 Patch Description Product: SPSS Statistics 18.0.1 Date: December 8, 2009 Description: This patch resolves the following issues: 1. A problem with reading large SAS data files was
More informationRandom Forest A. Fornaser
Random Forest A. Fornaser alberto.fornaser@unitn.it Sources Lecture 15: decision trees, information theory and random forests, Dr. Richard E. Turner Trees and Random Forests, Adele Cutler, Utah State University
More informationSAS Factory Miner 14.2: User s Guide
SAS Factory Miner 14.2: User s Guide SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2016. SAS Factory Miner 14.2: User s Guide. Cary, NC: SAS Institute
More informationManipulating Database Objects
Manipulating Database Objects Purpose This tutorial shows you how to manipulate database objects using Oracle Application Express. Time to Complete Approximately 30 minutes. Topics This tutorial covers
More informationEmbarcadero PowerSQL 1.1 Evaluation Guide. Published: July 14, 2008
Embarcadero PowerSQL 1.1 Evaluation Guide Published: July 14, 2008 Contents INTRODUCTION TO POWERSQL... 3 Product Benefits... 3 Product Benefits... 3 Product Benefits... 3 ABOUT THIS EVALUATION GUIDE...
More informationTackling Big Data Using MATLAB
Tackling Big Data Using MATLAB Alka Nair Application Engineer 2015 The MathWorks, Inc. 1 Building Machine Learning Models with Big Data Access Preprocess, Exploration & Model Development Scale up & Integrate
More informationTUTORIAL. HCS- Tools + Scripting Integrations
TUTORIAL HCS- Tools + Scripting Integrations HCS- Tools... 3 Setup... 3 Task and Data... 4 1) Data Input Opera Reader... 7 2) Meta data integration Expand barcode... 8 3) Meta data integration Join Layout...
More informationVisual Workflow Implementation Guide
Version 30.0: Spring 14 Visual Workflow Implementation Guide Note: Any unreleased services or features referenced in this or other press releases or public statements are not currently available and may
More informationEnterprise Data Catalog for Microsoft Azure Tutorial
Enterprise Data Catalog for Microsoft Azure Tutorial VERSION 10.2 JANUARY 2018 Page 1 of 45 Contents Tutorial Objectives... 4 Enterprise Data Catalog Overview... 5 Overview... 5 Objectives... 5 Enterprise
More information