Vijay Kotu, Bala Deshpande, in Data Science (Second Edition), 2019

10.3 How to Implement

Deep-learning architecture in RapidMiner can be implemented by a couple of different paths. Simple artificial neural networks with multiple hidden layers can be implemented with the Neural Net operator introduced in Chapter 4, Classification, Artificial Neural Network. The operator's parameters can be configured to include multiple hidden layers and multiple nodes within each layer. By default, the layer configuration is dense. This operator lacks the variety of layer designs that distinguishes deep-learning architecture from simple neural networks.

The Keras extension for RapidMiner offers a set of operators specific to deep learning. It utilizes the Keras neural network library for Python. Keras is designed to run on top of popular deep-learning frameworks like TensorFlow and Microsoft Cognitive Toolkit. The Keras extension in RapidMiner enables a top-level, visual, deep-learning process along with data science preprocessing and postprocessing. The Keras extension requires Python and related libraries to be installed.[14] Modeling and executing a deep-learning process in a production application requires machines running GPUs, because computation on normal machines can be time consuming. The following implementation uses Keras extension operators and can be run on general-purpose machines.[15] The key challenge in implementing this well-known model using RapidMiner is in preparing the dataset.

Figure 4.58. Classifying the two-ring nonlinear problem using a polynomial SVM kernel.

The point of this exercise was to demonstrate the flexibility of SVMs and the ease of performing such trials using RapidMiner. Unfortunately, with more realistic datasets, there is no way of knowing beforehand which kernel type would work best. The solution is to nest the SVM within an Optimization operator and explore a host of different kernel types and kernel parameters until one is found that performs reasonably well. (Optimization using RapidMiner is described in Chapter 15: Getting started with RapidMiner.)

Parameter Settings

There are many different parameters that can be adjusted depending on the type of kernel function that is chosen. There is, however, one parameter that is critical in optimizing SVM performance: the SVM complexity constant, C, which sets the penalties for misclassification, as was described in an earlier section. Most real-world datasets are not cleanly separable and, therefore, will require the use of this factor. For initial trials, however, it is best to go with the default settings.

The search box at the top of the operator window is also useful: if one knows even part of the operator name, it is easy to find out whether RapidMiner provides such an operator. For example, to see if there is an operator to handle CSV files, type “CSV” in the search field and both Read and Write will show up. Using search is a quick way to navigate to the operators if part of their name is known. Similarly, try “principal” and the operator for principal component analysis can be seen, which helps if there is uncertainty about the correct and complete operator name or where to look initially. Also, this search shows the hierarchy of where the operators exist, which helps one learn where they are.

RapidMiner offers dozens of different operators or ways to connect to data. The data can be stored in a flat file such as a comma-separated values (CSV) file or spreadsheet, in a database such as a Microsoft SQL Server table, or in other proprietary formats such as SAS, Stata, or SPSS. If the data is in a database, then at least a basic understanding of databases, database connections, and queries is essential in order to use the operator properly. One could choose to simply connect to the data (which is stored in a specific location on disk) or to import the dataset into the local RapidMiner repository itself, so that it becomes available to any process within the repository and can be retrieved every time RapidMiner is opened. Either way, RapidMiner offers easy-to-follow wizards that guide one through the steps. To simply connect to data in a CSV file on disk using a Read CSV operator (see 15.7), drag and drop the operator to the main process window. The Read CSV operator then needs to be configured by clicking on the Import Configuration Wizard, which provides a sequence of steps to follow to read the data in.[2]
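For readers who want to see what connecting to a CSV file involves outside the visual wizard, the following is a minimal Python sketch, not RapidMiner's own implementation. It makes the same kinds of decisions an import wizard would ask about: the column separator, whether the first row holds column names, and how to type each column. The sample data, the `read_csv_rows` helper, and the semicolon delimiter are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for a CSV file on disk.
SAMPLE = "id;age;label\n1;34;yes\n2;51;no\n"


def read_csv_rows(handle, delimiter=";"):
    """Parse delimited text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    rows = []
    for row in reader:
        # Convert fields made only of digits to int; keep everything else as text.
        rows.append({k: int(v) if v.isdigit() else v for k, v in row.items()})
    return rows


rows = read_csv_rows(io.StringIO(SAMPLE))
print(rows[0])
```

To read a real file instead of the in-memory sample, one would pass `open(path, newline="")` as the handle, where `path` is whatever location the data is stored in.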