{"id":3658,"date":"2018-06-21T20:14:23","date_gmt":"2018-06-21T20:14:23","guid":{"rendered":"https:\/\/max-drake.cc\/?p=3658"},"modified":"2018-06-29T15:24:57","modified_gmt":"2018-06-29T15:24:57","slug":"orange-2-image-clustering","status":"publish","type":"post","link":"https:\/\/max-drake.cc\/?p=3658","title":{"rendered":"Orange 2. Image Clustering"},"content":{"rendered":"<p>I watched this <a href=\"https:\/\/www.youtube.com\/watch?v=Iu8g2Twjn9U&amp;list=PLmNPvQr9Tf-ZSDLwOzxpvY-HrE0yv-8Fy&amp;index=14\" target=\"_blank\" rel=\"noopener\"><strong>Orange Tutorial<\/strong><\/a> and thought I&#8217;d like to give it a try. The Image Embedder goes back and processes the images based on an online database. It then adds more features to your initial data.<\/p>\n<p>The Image Analytics Tab is an ADD-ON. This was initially blocked and I had to run Orange as Administrator to access the downloads.<\/p>\n<p>If you plug one Data Table into the Image Loader and another after the Image Embedder, you will see that many more columns of information (2047) have been added for each image.<\/p>\n<p>This workflow in Orange seems easy to use (if you watch the videos on workflows). You do get results. 
But a lot is happening under the surface that is not obvious, so actually manipulating\/altering the information is currently beyond my understanding.<\/p>\n<h3>Process<\/h3>\n<p>I selected a general folder of photographs, just to test the process.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3664 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im19-1024x404.jpg\" alt=\"\" width=\"1470\" height=\"579\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im19-1024x404.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im19-300x118.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im19-768x303.jpg 768w\" data-sizes=\"(max-width: 1470px) 100vw, 1470px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1470px; --smush-placeholder-aspect-ratio: 1470\/579;\" \/><\/p>\n<p>The Orange setup was as per the video.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3663 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im19a-1024x487.jpg\" alt=\"\" width=\"1457\" height=\"692\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im19a-1024x487.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im19a-300x143.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im19a-768x365.jpg 768w\" data-sizes=\"(max-width: 1457px) 100vw, 1457px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1457px; --smush-placeholder-aspect-ratio: 1457\/692;\" \/><\/p>\n<p>For Distances, the video advised the &#8220;Cosine&#8221; metric.<\/p>\n<p><img decoding=\"async\" class=\"size-large wp-image-3670 aligncenter lazyload\" 
data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im23-913x1024.jpg\" alt=\"\" width=\"678\" height=\"760\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im23-913x1024.jpg 913w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im23-267x300.jpg 267w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im23-768x862.jpg 768w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im23.jpg 925w\" data-sizes=\"(max-width: 678px) 100vw, 678px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 678px; --smush-placeholder-aspect-ratio: 678\/760;\" \/><\/p>\n<p>For Hierarchical Clustering, the Linkage was set to &#8220;Ward&#8221;.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3669 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im24-1024x999.jpg\" alt=\"\" width=\"881\" height=\"859\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im24-1024x999.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im24-300x293.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im24-768x749.jpg 768w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im24.jpg 1846w\" data-sizes=\"(max-width: 881px) 100vw, 881px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 881px; --smush-placeholder-aspect-ratio: 881\/859;\" \/><\/p>\n<p>As part of the process, starting from the original file, which has 5 column headers (see the top data table in the image below), the Embedder creates another 2047 columns (see the bottom data table in the image below). 
So it must be doing a lot of analysis behind the scenes, and it&#8217;s not taking hours to generate.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3673 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im25-1024x666.jpg\" alt=\"\" width=\"1490\" height=\"969\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im25-1024x666.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im25-300x195.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im25-768x500.jpg 768w\" data-sizes=\"(max-width: 1490px) 100vw, 1490px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1490px; --smush-placeholder-aspect-ratio: 1490\/969;\" \/><\/p>\n<p>Some of the results were very close. By selecting part of the hierarchical tree, you can view the results in the Image Viewer.<\/p>\n<p>The following one was pretty close: the Einstein Tower &amp; Guggenheim Bilbao, both buildings having a lot of curves.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3666 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im21-1024x594.jpg\" alt=\"\" width=\"1453\" height=\"842\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im21-1024x594.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im21-300x174.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im21-768x446.jpg 768w\" data-sizes=\"(max-width: 1453px) 100vw, 1453px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1453px; --smush-placeholder-aspect-ratio: 1453\/842;\" \/><\/p>\n<p>Another cluster that seemed very close was this grouping of castles and monuments. 
<img decoding=\"async\" class=\"wp-image-3667 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im20-1024x599.jpg\" alt=\"\" width=\"1436\" height=\"841\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im20-1024x599.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im20-300x176.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im20-768x449.jpg 768w\" data-sizes=\"(max-width: 1436px) 100vw, 1436px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1436px; --smush-placeholder-aspect-ratio: 1436\/841;\" \/><\/p>\n<p>This major grouping almost seems like a bin of &#8220;The others&#8221;. There does not seem to be much consistency among the images. Maybe at a more granular level there will be better groupings.<img decoding=\"async\" class=\"wp-image-3665 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im22-1024x653.jpg\" alt=\"\" width=\"1466\" height=\"934\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im22-1024x653.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im22-300x191.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im22-768x490.jpg 768w\" data-sizes=\"(max-width: 1466px) 100vw, 1466px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1466px; --smush-placeholder-aspect-ratio: 1466\/934;\" \/><\/p>\n<p>You can alter the clustering depth by changing the numbers in the left sidebar of the Hierarchical Clustering node, which will redefine the cluster groupings.<\/p>\n<p><img decoding=\"async\" class=\"wp-image-3672 aligncenter lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im26-1024x652.jpg\" alt=\"\" 
width=\"1546\" height=\"985\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im26-1024x652.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im26-300x191.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im26-768x489.jpg 768w\" data-sizes=\"(max-width: 1546px) 100vw, 1546px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1546px; --smush-placeholder-aspect-ratio: 1546\/985;\" \/><\/p>\n<p>After adjusting the clustering to how you want the images grouped, you can save the results with the SAVE DATA node. This actually spat out all 2047 columns plus the original columns plus the clustering codes to a CSV file. So I deleted all the 2047 columns and kept the clustering codes. The file names are there too, so you could create an Excel VBA macro that copies the files to another directory, with the cluster as the directory name (e.g. C1, C2, C3&#8230;), using a process such as <a href=\"https:\/\/analysistabs.com\/excel-vba\/copy-files-one-location-another-folder-directory\/\" target=\"_blank\" rel=\"noopener\">this tutorial<\/a>.<\/p>\n<p>Actually you could do it quite easily with a Python script (I&#8217;m just wondering whether it&#8217;s possible to embed a Python script within Orange to do this; that would be neat).<\/p>\n<p><img decoding=\"async\" class=\"alignnone wp-image-3675 lazyload\" data-src=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im27-1024x659.jpg\" alt=\"\" width=\"1484\" height=\"954\" data-srcset=\"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im27-1024x659.jpg 1024w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im27-300x193.jpg 300w, https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im27-768x494.jpg 768w\" data-sizes=\"(max-width: 1484px) 100vw, 1484px\" 
src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1484px; --smush-placeholder-aspect-ratio: 1484\/954;\" \/><\/p>\n<p>So if you had a group of images you could group them by cluster, then inspect the clusters. The ones that are well grouped you&#8217;d keep. The ones that did not cluster well you could run through again as a separate process, and also test out different DISTANCES algorithms, instead of the &#8220;Cosine&#8221; one selected, to see if you could cluster them more effectively.<\/p>\n<h3>End Comment<\/h3>\n<p>I&#8217;m not sure I have a practical application for this workflow at this time. I&#8217;m impressed with how it works and how quickly it does the processing.<\/p>\n<p>You could see the process being used for a security camera taking still pictures at an entrance lobby, which is a bit scary in some ways.<\/p>\n<p>There is <strong><a href=\"https:\/\/www.youtube.com\/watch?v=6srGs5w9x8w\" target=\"_blank\" rel=\"noopener\">another video<\/a><\/strong> on matching image styles, which matches 2 Monet paintings.<\/p>\n<p>I do like the short videos for tutorials, but some more in-depth videos would be good, explaining what is going on and how to fix things when they break.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I watched this Orange Tutorial and thought I&#8217;d like to give it a try. The Image Embedder goes back and processes the images based on an online database. It then adds more features to your initial data. The Image Analytics Tab is an ADD-ON. 
This was initially blocked and I had to run Orange in [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":3666,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[40,33,12],"tags":[],"class_list":["post-3658","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-analysis","category-knime-orange-rapidminer","category-visualisation"],"featured_image_src":"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im21.jpg","featured_image_src_square":"https:\/\/max-drake.cc\/wp-content\/uploads\/2018\/06\/im21.jpg","author_info":{"display_name":"Max Drake","author_link":"https:\/\/max-drake.cc\/?author=1"},"_links":{"self":[{"href":"https:\/\/max-drake.cc\/index.php?rest_route=\/wp\/v2\/posts\/3658","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/max-drake.cc\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/max-drake.cc\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/max-drake.cc\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/max-drake.cc\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3658"}],"version-history":[{"count":0,"href":"https:\/\/max-drake.cc\/index.php?rest_route=\/wp\/v2\/posts\/3658\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/max-drake.cc\/index.php?rest_route=\/wp\/v2\/media\/3666"}],"wp:attachment":[{"href":"https:\/\/max-drake.cc\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3658"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/max-drake.cc\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3658"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/max-drake.cc\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3658"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}