{"id":3864,"date":"2018-07-19T17:26:43","date_gmt":"2018-07-19T17:26:43","guid":{"rendered":"https:\/\/max-drake.cc\/?p=3864"},"modified":"2018-07-22T14:50:25","modified_gmt":"2018-07-22T14:50:25","slug":"weka-2-weka-machine-learning-alternative-interfaces-experimenter-knowledgeflow-command-line","status":"publish","type":"post","link":"https:\/\/max-drake.cc\/?p=3864","title":{"rendered":"Weka 2. Weka Machine Learning &#8220;Explorer&#8221; alternative interfaces &#8220;Experimenter&#8221;, &#8220;Knowledge Flow&#8221; &#038; &#8220;Command Line&#8221;."},"content":{"rendered":"<p>This post follows on from the first Weka post, which was based on information gleaned from the <a href=\"https:\/\/www.futurelearn.com\/courses\/data-mining-with-weka\"><strong>Data Mining with Weka<\/strong><\/a> course that I followed.<\/p>\n<p>This post is based on the follow-up <a href=\"https:\/\/www.youtube.com\/playlist?list=PLm4W7_iX_v4OMSgc8xowC2h70s-unJKCp\" target=\"_blank\" rel=\"noopener\"><strong>More Data Mining with Weka<\/strong><\/a> videos.<\/p>\n<p>Some of the screenshots below are from the videos, which were developed and are presented by Ian Witten of the <a href=\"https:\/\/www.cs.waikato.ac.nz\/ml\/\" target=\"_blank\" rel=\"noopener\">Machine Learning Group, University of Waikato<\/a> in NZ. I do not have his permission to use them, but I cite him as the source of this information.<\/p>\n<h3>Earlier lesson takeaways<\/h3>\n<p>The Training Set &amp; Test Set should be independent. 
Otherwise you cannot realistically test whether your classifier works, because you would only be evaluating it on the data it was trained on.<\/p>\n<p><img decoding=\"async\" class=\"alignnone wp-image-3880 lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im15-1024x585.jpg\" alt=\"\" width=\"1498\" height=\"855\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im15-1024x585.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im15-300x171.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im15-768x439.jpg 768w\" data-sizes=\"(max-width: 1498px) 100vw, 1498px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1498px; --smush-placeholder-aspect-ratio: 1498\/855;\" \/><\/p>\n<p>Statistical mean, variance &amp; standard deviation formulae.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3879 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im16-1024x567.jpg\" alt=\"\" width=\"1457\" height=\"806\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im16-1024x567.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im16-300x166.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im16-768x425.jpg 768w\" data-sizes=\"(max-width: 1457px) 100vw, 1457px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1457px; --smush-placeholder-aspect-ratio: 1457\/806;\" \/><\/p>\n<p>The 10-fold cross-validation method breaks the data set into 10 equal pieces, uses 9 pieces to train and the 10th to test, then repeats the process with the test piece returned to the training set and a different piece held out as the test set, until every piece has been used for testing once. It is a longer iterative process, but it lets you cross-validate the results. 
There are subtleties in how you break the data set up into equal-sized pieces: each piece should contain a consistent mix of classes (so that one piece is not all no&#8217;s and another all yes&#8217;s; aim for a balance), which is known as stratification.<\/p>\n<p><img decoding=\"async\" class=\"alignnone wp-image-3878 lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im17-1024x588.jpg\" alt=\"\" width=\"1471\" height=\"845\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im17-1024x588.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im17-300x172.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im17-768x441.jpg 768w\" data-sizes=\"(max-width: 1471px) 100vw, 1471px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1471px; --smush-placeholder-aspect-ratio: 1471\/845;\" \/><\/p>\n<h2>Experimenter<\/h2>\n<p>Mostly based on <a href=\"https:\/\/www.youtube.com\/watch?v=IzFA4l0JRk4&amp;list=PLm4W7_iX_v4OMSgc8xowC2h70s-unJKCp&amp;index=4\" target=\"_blank\" rel=\"noopener\">More Data Mining with Weka (1.3: Comparing classifiers)<\/a>.<\/p>\n<p>The Experimenter allows you to run multiple machine learning algorithms (or variations of one algorithm) over multiple data sets. It has 3 tabs: the Setup tab, the Run tab and the Analyse tab. 
You can save setup configurations to a file, and you can save results to a file (e.g. a .arff or .csv file).<\/p>\n<p>Select a NEW Experiment configuration (item 4 below).<\/p>\n<p>In the <strong>setup tab<\/strong> you can choose the data files to include, and you can choose multiple data sets (bottom left-hand panel).<\/p>\n<p>In the bottom right-hand panel you can choose the algorithms that you want to test, and the order in which you want to run them.<\/p>\n<p>You can also choose the Experiment type and the number of iterations.<\/p>\n<p>&nbsp;<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3868 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im18-1-1024x549.jpg\" alt=\"\" width=\"1494\" height=\"800\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im18-1-1024x549.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im18-1-300x161.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im18-1-768x412.jpg 768w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im18-1.jpg 1591w\" data-sizes=\"(max-width: 1494px) 100vw, 1494px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1494px; --smush-placeholder-aspect-ratio: 1494\/800;\" \/><\/p>\n<p>Once your setup is as you wish, go to the <strong>run tab<\/strong> and hit Start, and it will run the configurations you have in setup. 
Then go to the Analyse tab.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3867 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im23-1-1024x799.jpg\" alt=\"\" width=\"1498\" height=\"1169\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im23-1-1024x799.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im23-1-300x234.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im23-1-768x599.jpg 768w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im23-1.jpg 1109w\" data-sizes=\"(max-width: 1498px) 100vw, 1498px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1498px; --smush-placeholder-aspect-ratio: 1498\/1169;\" \/><\/p>\n<p>In the <strong>Analyse Tab<\/strong> hit the EXPERIMENT button (top right), then the PERFORM TEST button, to get the results in the Test Output panel (bottom right).<\/p>\n<p>In this setup the results are compared against the trees.J48 algorithm, as it was the first in the list. In the left-hand panel we set a significance level of 0.05 (5%). The test then compares the other 2 methods we selected against trees.J48 and marks any result that is statistically significantly worse at that level with an &#8220;*&#8221;. So, if you look at the table below, for the Iris dataset, rules.ZeroR has only a 33.33% prediction accuracy compared to trees.J48, which has a 94.73% accuracy on the training set. So it is significantly worse, hence the &#8220;*&#8221;. 
By contrast, rules.OneR is 92.53% accurate, so its difference from trees.J48 is not statistically significant at the selected 5% level (it may well perform better than trees.J48 on a different training set).<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3874 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im22-1.jpg\" alt=\"\" width=\"1458\" height=\"1110\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im22-1.jpg 993w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im22-1-300x228.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im22-1-768x585.jpg 768w\" data-sizes=\"(max-width: 1458px) 100vw, 1458px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1458px; --smush-placeholder-aspect-ratio: 1458\/1110;\" \/><\/p>\n<p>In the example above, the results compare rules.ZeroR and rules.OneR against the trees.J48 algorithm. 
If we want to compare against rules.OneR instead, we can change the setup as below.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3873 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im24-1.jpg\" alt=\"\" width=\"1534\" height=\"1500\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im24-1.jpg 979w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im24-1-300x294.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im24-1-768x752.jpg 768w\" data-sizes=\"(max-width: 1534px) 100vw, 1534px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1534px; --smush-placeholder-aspect-ratio: 1534\/1500;\" \/><\/p>\n<p><img decoding=\"async\" class=\"wp-image-3872 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im25-1.jpg\" alt=\"\" width=\"1458\" height=\"1086\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im25-1.jpg 1000w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im25-1-300x224.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im25-1-768x572.jpg 768w\" data-sizes=\"(max-width: 1458px) 100vw, 1458px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1458px; --smush-placeholder-aspect-ratio: 1458\/1086;\" \/><\/p>\n<p>Also, if we want the rows of the output to be the different algorithms and the columns to be the datasets, we can change the configuration by changing the Rows from &#8220;Dataset&#8221; to &#8220;Scheme&#8221; and the Cols from &#8220;Scheme&#8221; to &#8220;Dataset&#8221;.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3870 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im27-1-1024x519.jpg\" 
alt=\"\" width=\"1527\" height=\"775\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im27-1-1024x519.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im27-1-300x152.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im27-1-768x389.jpg 768w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im27-1.jpg 1669w\" data-sizes=\"(max-width: 1527px) 100vw, 1527px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1527px; --smush-placeholder-aspect-ratio: 1527\/775;\" \/><\/p>\n<h2>Knowledge Flow Interface<\/h2>\n<p>This is similar to the Knime node process: you set up a workflow, plug in a dataset, run it and visualise the results.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3891 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im30-1-934x1024.jpg\" alt=\"\" width=\"1476\" height=\"1618\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im30-1-934x1024.jpg 934w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im30-1-274x300.jpg 274w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im30-1-768x842.jpg 768w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im30-1.jpg 1016w\" data-sizes=\"(max-width: 1476px) 100vw, 1476px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1476px; --smush-placeholder-aspect-ratio: 1476\/1618;\" \/><\/p>\n<p>Select a node from the list and click in the blank area to place it there.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3890 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im31-1.jpg\" alt=\"\" width=\"1459\" height=\"1104\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im31-1.jpg 997w, 
https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im31-1-300x227.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im31-1-768x582.jpg 768w\" data-sizes=\"(max-width: 1459px) 100vw, 1459px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1459px; --smush-placeholder-aspect-ratio: 1459\/1104;\" \/><\/p>\n<p>Right click on the node and choose configure to set up the node.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3889 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im32-1-1024x854.jpg\" alt=\"\" width=\"1499\" height=\"1249\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im32-1-1024x854.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im32-1-300x250.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im32-1-768x640.jpg 768w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im32-1.jpg 1299w\" data-sizes=\"(max-width: 1499px) 100vw, 1499px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1499px; --smush-placeholder-aspect-ratio: 1499\/1249;\" \/><\/p>\n<p>Add another node and configure it.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3888 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im33-1.jpg\" alt=\"\" width=\"1473\" height=\"1121\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im33-1.jpg 1008w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im33-1-300x228.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im33-1-768x584.jpg 768w\" data-sizes=\"(max-width: 1473px) 100vw, 1473px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" 
style=\"--smush-placeholder-width: 1473px; --smush-placeholder-aspect-ratio: 1473\/1121;\" \/><\/p>\n<p>Right click on the first node and connect it to the 2nd by selecting dataset (in this example; the last slide uses instance instead).<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3887 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im34.jpg\" alt=\"\" width=\"1469\" height=\"1122\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im34.jpg 996w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im34-300x229.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im34-768x587.jpg 768w\" data-sizes=\"(max-width: 1469px) 100vw, 1469px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1469px; --smush-placeholder-aspect-ratio: 1469\/1122;\" \/><\/p>\n<p>Connect the nodes together.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3886 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im35.jpg\" alt=\"\" width=\"1464\" height=\"1101\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im35.jpg 1012w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im35-300x226.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im35-768x578.jpg 768w\" data-sizes=\"(max-width: 1464px) 100vw, 1464px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1464px; --smush-placeholder-aspect-ratio: 1464\/1101;\" \/><\/p>\n<p>Create a workflow by connecting nodes together for: 1) the dataset, 2) actions\/processes, 3) output (text visualiser\/graphics). 
Then hit the RUN icon and right click the visualisers to show the results in pop-up boxes.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3885 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im36-841x1024.jpg\" alt=\"\" width=\"1475\" height=\"1797\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im36-841x1024.jpg 841w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im36-246x300.jpg 246w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im36-768x935.jpg 768w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im36.jpg 1123w\" data-sizes=\"(max-width: 1475px) 100vw, 1475px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1475px; --smush-placeholder-aspect-ratio: 1475\/1797;\" \/><\/p>\n<p>Once you have the graphical display, you can change what it shows by changing the X and Y axis attributes.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3884 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im37-616x1024.jpg\" alt=\"\" width=\"1453\" height=\"2415\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im37-616x1024.jpg 616w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im37-180x300.jpg 180w\" data-sizes=\"(max-width: 1453px) 100vw, 1453px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1453px; --smush-placeholder-aspect-ratio: 1453\/2415;\" \/><\/p>\n<p>The workflow below is based on INSTANCE instead of dataset. This lets data stream through one instance at a time, so it could in principle handle an unbounded dataset, as the data is not all held in memory. (A comment on output: 
I couldn&#8217;t change the graph size; maybe I needed to save the results and modify the size in the saved file).<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3883 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im38-1024x1008.jpg\" alt=\"\" width=\"1628\" height=\"1602\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im38-1024x1008.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im38-300x295.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im38-768x756.jpg 768w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im38.jpg 1128w\" data-sizes=\"(max-width: 1628px) 100vw, 1628px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1628px; --smush-placeholder-aspect-ratio: 1628\/1602;\" \/><\/p>\n<h2>Command Line Interface (CLI)<\/h2>\n<p>I prefer the other options and would not use the CLI unless I had to, so I will not go into it here. Refer to the video if you are interested. 
<a href=\"https:\/\/www.youtube.com\/watch?v=8Sl4mjFwiSE&amp;list=PLm4W7_iX_v4OMSgc8xowC2h70s-unJKCp&amp;index=6\" target=\"_blank\" rel=\"noopener\"><strong>This video<\/strong><\/a> discusses the CLI and also linking to a database.<\/p>\n<h3>&nbsp;<img decoding=\"async\" class=\"wp-image-3898 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im39-1003x1024.jpg\" alt=\"\" width=\"1691\" height=\"1726\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im39-1003x1024.jpg 1003w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im39-294x300.jpg 294w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im39-768x784.jpg 768w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im39.jpg 1008w\" data-sizes=\"(max-width: 1691px) 100vw, 1691px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1691px; --smush-placeholder-aspect-ratio: 1691\/1726;\" \/><\/h3>\n<h2>Weka algorithms for Implementation. A comment on the tool<\/h2>\n<p>The <em>Data Mining with Weka<\/em> and <em>More Data Mining with Weka<\/em> videos demonstrate the use of Weka to run algorithms on datasets to predict outcomes. After setting up and pressing the buttons, something happens and a % prediction accuracy is given.<\/p>\n<p>For some of the algorithms I cannot see what formula\/model is constructed that you could then use to build a predictor in, say, Python or some other tool such as Knime.<\/p>\n<p>For <a href=\"https:\/\/en.wikipedia.org\/wiki\/Simple_linear_regression\" target=\"_blank\" rel=\"noopener\">Simple&nbsp;Linear Regression<\/a> you can get the intercept (start point) and slope of the line to predict the y variable. 
So you can plug numeric data into the formula to get the results.<\/p>\n<p>In class 3 of the <a href=\"https:\/\/www.youtube.com\/playlist?list=PLm4W7_iX_v4OMSgc8xowC2h70s-unJKCp\" target=\"_blank\" rel=\"noopener\"><strong>More Data Mining with Weka<\/strong><\/a> videos, which covers classification rules, association rules and clustering, the <a href=\"https:\/\/www.youtube.com\/watch?v=ckPh9jYaRWM&amp;list=PLm4W7_iX_v4OMSgc8xowC2h70s-unJKCp&amp;index=15\" target=\"_blank\" rel=\"noopener\">2nd video<\/a> talks about the rules.PART &amp; rules.JRip methods of making decision rules. This is the first thing I have seen for nominal data that I believe I could implement, after testing on a Training Set &amp; a Test Set. The demonstration in the lesson predicted the class correctly 74% of the time, working with 8 attributes.<\/p>\n<p>So whilst Weka seems to be a good analysis\/testing tool, I cannot see a practical way of easily turning its results into a workflow for implementation. I think I would try to reproduce the algorithms in Knime to build something practical. There is the Knowledge Flow process, but I prefer Knime as it is easier to build with. So test in Weka, build in Knime would be my thoughts on implementing some Machine Learning algorithms.<\/p>\n<h3>End Comment<\/h3>\n<p>I am quite impressed with the variety of interfaces that Weka has. I like the fact that you can start with the simple Explorer to explore your data and apply different filters and tests to it, then move on to the Experimenter to do multiple tests and runs, and then build workflow nodes (like Knime) in Knowledge Flow to set up some processes and handle streaming data. The Command Line Interface is always a backstop if you have issues with an interface and just need the grunt rather than the visuals. I personally like the visuals. 
I really like the flexibility of the interface.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Following on from the first Weka post, which was based on information gleaned from the Data Mining with Weka course that I followed. This post is based on the following More Data Mining with Weka videos. Some of&nbsp; the screenshots below from the video&#8217;s that have been developed and are presented by Ian Witten of [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":3884,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[40,33,12],"tags":[],"class_list":["post-3864","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-analysis","category-knime-orange-rapidminer","category-visualisation"],"featured_image_src":"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im37.jpg","featured_image_src_square":"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/07\/im37.jpg","author_info":{"display_name":"Max 
Drake","author_link":"https:\/\/max-drake.cc\/?author=1"},"_links":{"self":[{"href":"https:\/\/max-drake.cc\/index.php?rest_route=\/wp\/v2\/posts\/3864","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/max-drake.cc\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/max-drake.cc\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/max-drake.cc\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/max-drake.cc\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3864"}],"version-history":[{"count":0,"href":"https:\/\/max-drake.cc\/index.php?rest_route=\/wp\/v2\/posts\/3864\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/max-drake.cc\/index.php?rest_route=\/wp\/v2\/media\/3884"}],"wp:attachment":[{"href":"https:\/\/max-drake.cc\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3864"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/max-drake.cc\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3864"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/max-drake.cc\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3864"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}