Mastering Naïve Bayes: A Step-by-Step Guide for ICT724 Students

ICT724 Tutorial 2 Created by: Dr. Hui Wu 30/10/2024

T3 2024: ICT724 Intelligent Systems Tutorial 5

Submission requirement: submit a PDF file or a Word file to Moodle.
Deadline: 23:59 Sunday 08/12/2024.

In this tutorial, you will learn to use the Naïve Bayes operator provided by Altair AI Studio (formerly known as RapidMiner). To download Altair AI Studio, please follow the instructions at https://docs.rapidminer.com/latest/studio/installation/index.html. Next, watch the following videos to understand the basics of AI Studio:

1. Introduction
2. Import data
3. Data cleansing
4. Visualization data

Naïve Bayes is a high-bias, low-variance classifier, and it can build a good model even with a small data set. It is simple to use and computationally inexpensive. Typical use cases include text categorization (for example, spam detection and sentiment analysis) and recommender systems.

The fundamental assumption of Naïve Bayes is that, given the value of the label (the class), the value of any Attribute is independent of the value of any other Attribute. Strictly speaking, this assumption is rarely true (it's "naive"!), but experience shows that the Naïve Bayes classifier often works well. The independence assumption vastly simplifies the calculations needed to build the Naïve Bayes probability model: the probability of a whole example given the class becomes the product of the per-Attribute probabilities given that class.

To complete the probability model, it is necessary to make some assumption about the conditional probability distributions for the individual Attributes, given the class. The Naïve Bayes operator uses Gaussian probability densities to model the Attribute data.

To start the lab, download the file “Customer.xlsx” and import it into RapidMiner.
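To make the probability model described above more concrete, here is a minimal Python sketch of a Gaussian Naïve Bayes classifier. It is not the AI Studio implementation, only an illustration of the idea under stated assumptions: the function names (`fit_gaussian_nb`, `log_gaussian`, `predict`), the two numeric attributes, and the toy values are all made up for this example, and a small constant is added to each variance to keep it positive.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate the class prior and per-attribute (mean, variance) for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    model = {}
    for label, members in by_class.items():
        n = len(members)
        stats = []
        for column in zip(*members):  # iterate attribute by attribute
            mean = sum(column) / n
            # Assumption of this sketch: a tiny constant keeps the variance > 0.
            var = sum((v - mean) ** 2 for v in column) / n + 1e-9
            stats.append((mean, var))
        model[label] = (n / len(rows), stats)
    return model


def log_gaussian(x, mean, var):
    """Log of the Gaussian probability density at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def predict(model, row):
    """Return the class with the highest log prior plus summed log densities."""
    scores = {}
    for label, (prior, stats) in model.items():
        # Independence assumption: per-attribute log densities simply add up.
        scores[label] = math.log(prior) + sum(
            log_gaussian(x, mean, var) for x, (mean, var) in zip(row, stats)
        )
    return max(scores, key=scores.get)


# Made-up toy data: [monthly_charge, tenure_months]; labels play the role of "churn".
rows = [[80.0, 2.0], [90.0, 3.0], [85.0, 1.0],
        [30.0, 40.0], [35.0, 36.0], [25.0, 48.0]]
labels = ["yes", "yes", "yes", "no", "no", "no"]

model = fit_gaussian_nb(rows, labels)
print(predict(model, [88.0, 2.0]))
print(predict(model, [28.0, 42.0]))
```

Working with log probabilities instead of multiplying raw densities is a common design choice, because a product of many small numbers can underflow to zero.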

Apply all the necessary operators, such as “Filter Examples”, “Select Attributes”, “Set Role”, and “Split Data”. “Split Data” is used to divide the data into training and test datasets (usually 80% and 20%). In this example, because we want to predict the churn value, we choose the attribute “churn” as the label in “Set Role”. Now, add the Naïve Bayes operator to the process panel and connect the training set to the training input of the operator.
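For readers who want to see what the “set role” and “split data” steps amount to outside AI Studio, the following Python sketch mimics them on a small made-up table. The helper names `set_label_role` and `split_data`, the seed value, and every column except “churn” are assumptions for illustration; the contents of “Customer.xlsx” are not reproduced here.

```python
import random


def set_label_role(records, label_name):
    """Separate the label column from the regular attributes."""
    features = [{k: v for k, v in r.items() if k != label_name} for r in records]
    labels = [r[label_name] for r in records]
    return features, labels


def split_data(items, train_ratio=0.8, seed=42):
    """Shuffle reproducibly, then cut into training and test partitions."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# Made-up records; only the "churn" column name comes from the tutorial.
records = [
    {"customer_id": i, "tenure_months": 3 * i, "churn": "yes" if i % 3 == 0 else "no"}
    for i in range(10)
]

train, test = split_data(records, train_ratio=0.8)
train_x, train_y = set_label_role(train, "churn")
test_x, test_y = set_label_role(test, "churn")
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, so the same examples land in the training and test sets each time the sketch is run.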

After that, the trained model is applied using the “Apply Model” operator, to which we also connect the test set for validation. If we connect the “lab” output port to the output of the process panel, we should be able to see the predictions. See the screenshot below:

Now you can check whether your model's predictions are accurate by comparing them with the actual churn values. To see the accuracy as a percentage, you can use the “Performance” operator and get the following result:
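The accuracy figure is simply the share of test examples whose predicted label matches the actual label. As a rough illustration, the Python sketch below computes it by hand together with confusion counts. The function names `accuracy` and `confusion_counts` and the label lists are invented for this example; they are not output from the tutorial's data set.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of examples whose predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def confusion_counts(actual, predicted):
    """Count each (actual, predicted) pair, as in a confusion matrix."""
    return Counter(zip(actual, predicted))


# Made-up labels standing in for the "churn" column and the model's predictions.
actual = ["yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "yes", "yes", "no"]

print(f"accuracy: {accuracy(actual, predicted):.0%}")
for (a, p), count in sorted(confusion_counts(actual, predicted).items()):
    print(f"actual={a} predicted={p}: {count}")
```

The confusion counts show where the errors fall (for example, actual “no” predicted as “yes”), which a single accuracy number hides.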

The detailed demo of the above process is given here.

Task: Follow the above process, take a screenshot of each step, and include all the screenshots in sequence in your submission. Adding one operator to your process counts as one step.

References

https://docs.rapidminer.com/latest/studio/operators/modeling/predictive/bayesian/naive_bayes.html