Orange is a GUI-based tool that is great for visualizing patterns and understanding data. It lets users do these things without writing code.
Why Orange?
- It’s easy to use. Professionals from any field can use it because no coding is required.
- Basic visualization, data manipulation, transformation and mining can all be done in a single workflow.
- It has some wonderful visuals, which make presentations appealing.
Getting Started with Orange
Download the Orange distribution package from the link below and run the installation file on your local computer.
Download link: https://orange.biolab.si/download/
Once installation is done you should be able to run Orange: just locate the Orange icon and click it.
When you run Orange for the first time, you should be greeted by something similar to this.
You can click Tutorial to watch tutorials on YouTube, or click Examples to browse the preloaded workflows.
After selecting Examples:
- You can choose any of the preloaded data mining workflows.
- For this module we will choose hierarchical clustering.
- The selected tutorial will open in the Orange canvas. In Orange, data mining workflows consist of computational components called widgets. Widgets do all the work and exchange information. They communicate through channels. In the workflow below, the File widget sends its data to the Data Table widget and the Distances widget, which, in turn, communicates the computed distances to two other widgets in the workflow.
- The File widget in the top left-hand corner reads data from your computer and sends it to other widgets.
- If you double-click the File widget, it opens a window where you can browse the documentation data sets. For this example, select titanic.tab from the preinstalled data files.
- From this file we will estimate the probability of survival of a passenger based on the information in the file itself.
- Use the little curve at the edge of the File widget to choose other widgets to send data to. For this example we will send data from the File widget to Data Table and Sieve Diagram.
- Now double-click Sieve Diagram to compare the observed survival frequencies with the expected ones. You can then play with combinations of attributes to answer the following questions:
- Which combination of class, sex and age had the lowest probability of survival?
- Who had a higher probability of survival: the crew or the first-class passengers?
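Orange needs no code for any of this, but if you are curious what the Sieve Diagram is comparing, here is a minimal Python sketch. It assumes that the "expected" frequencies are the ones you would see if the two attributes were independent (row total times column total, divided by the number of rows). The function name expected_counts and the toy rows are made up for illustration; they are not part of Orange and not the real titanic.tab contents.

```python
from collections import Counter


def expected_counts(rows, attr_a, attr_b):
    """Return {(a, b): (observed, expected)} for two categorical attributes.

    "expected" is the count we would see if attr_a and attr_b were
    independent: count(a) * count(b) / total number of rows.
    """
    n = len(rows)
    count_a = Counter(r[attr_a] for r in rows)
    count_b = Counter(r[attr_b] for r in rows)
    observed = Counter((r[attr_a], r[attr_b]) for r in rows)
    result = {}
    for a in count_a:
        for b in count_b:
            expected = count_a[a] * count_b[b] / n
            result[(a, b)] = (observed[(a, b)], expected)
    return result


# Hypothetical toy rows for illustration only, NOT the real Titanic data.
toy_rows = (
    [{"sex": "female", "survived": "yes"}] * 3
    + [{"sex": "female", "survived": "no"}] * 1
    + [{"sex": "male", "survived": "yes"}] * 1
    + [{"sex": "male", "survived": "no"}] * 3
)

for cell, (obs, exp) in sorted(expected_counts(toy_rows, "sex", "survived").items()):
    print(cell, "observed:", obs, "expected:", exp)
```

Cells where the observed count is well above or below the expected one are the combinations worth a closer look in the diagram.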
Now play around with the other preloaded files to learn more.
Using External Files in Orange
First, we need to pick a File widget and use it to open the file we downloaded from Kaggle.
Note: Depending on the file we downloaded, it might take some time to load and to work with other widgets. The larger the file, the longer it takes.
For this tutorial we will be using these two Excel files. Each file needs its own File widget, so repeat the steps for both files.
From the figure we can see exactly how many attributes are available in each file. Now that the files are loaded, we can start visualizing.
Using the Distributions widget, we can visualize the distribution of latitude among the states in the USA.
Or, using the Scatter Plot widget, we can plot longitude among the states.
Or else, just by using Data Table, we can create one table from each file.
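If you want a feel for what a distribution chart does with a numeric column such as latitude, the idea can be sketched in a few lines of Python. This is only an illustration of binning, not how Orange implements it; the histogram function, the bin width of 10 and the sample latitude values are all assumptions made up for this example.

```python
import math
from collections import Counter


def histogram(values, bin_width):
    """Count how many values fall into each bin of the given width.

    Each bin is identified by its lower edge, e.g. 30 for the range 30 to 40.
    """
    bins = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(bins.items()))


# Hypothetical latitude values for illustration only.
sample_latitudes = [25.8, 33.9, 34.1, 40.7, 41.9, 47.6]

bin_width = 10
for start, count in histogram(sample_latitudes, bin_width).items():
    # One '#' per value that fell into this bin.
    print(f"{start:>4} to {start + bin_width:<4} {'#' * count}")
```

A smaller bin width gives a more detailed but noisier picture, which is the same trade-off you make when adjusting the bins of a distribution chart.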
Let’s work with another data set, this time flight data. The file is bigger, so loading it and working with widgets will take a lot longer, depending on your computer.
Let’s load adsb.csv.
Connect the File widget to a Data Table to see the table.
And then load up some charts.
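Since big files can be slow to load, it can help to peek at the columns before opening the file in Orange. The sketch below uses only Python's standard csv module and reads just the first few rows instead of the whole file. The function name preview_csv is made up for this example, and the code assumes the file is a comma-separated text file with a header row, which may not hold for every download.

```python
import csv
from itertools import islice


def preview_csv(path, n_rows=5):
    """Return (column_names, first n_rows rows) without reading the whole file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # islice stops after n_rows, so a very large file is not fully read.
        rows = list(islice(reader, n_rows))
        return reader.fieldnames, rows


# Example usage (assumes adsb.csv is in the current folder):
# columns, rows = preview_csv("adsb.csv")
# print(columns)
# for row in rows:
#     print(row)
```

If the column list looks right, the file should also be readable by the File widget.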
Written by Masud Imran