Fundamentals of Data Science: Python Basics and Lab Exercises

School

Amity University Dubai

Course

CSE DBMS

Subject

Computer Science

Date

Dec 10, 2024


FUNDAMENTALS OF DATA SCIENCE AND ANALYTICS
CSIT366
LAB FILE

Department of Computer Science and Engineering
AMITY SCHOOL OF ENGINEERING AND TECHNOLOGY
AMITY UNIVERSITY DUBAI
Batch (2022-2025) (Semester 5)
Bachelor of Science in Information Technology

Under the guidance of:
Faculty: Ms Archana Pandita
Lab Instructor: Munaze Sir

Submitted by: Josue Claver Manamou
Enrolment no: A92604922035
University Email Address: josueM@amitydubai.ae
Submission Date: 09/12/2024

1) Practice printing, string concatenation, and if-else ladder

# Printing and string concatenation
name = "John"
age = 25
print("Hello, my name is " + name + " and I am " + str(age) + " years old.")
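The concatenation above needs str(age) because the + operator cannot join a string and an integer. As a minimal sketch of an alternative (not part of the original submission), an f-string formats the value in place and should produce the same text:

```python
name = "John"
age = 25

# Concatenation: the integer must be converted explicitly with str().
concatenated = "Hello, my name is " + name + " and I am " + str(age) + " years old."

# f-string: values inside {} are formatted in place, so no str() call is needed.
formatted = f"Hello, my name is {name} and I am {age} years old."

print(formatted)
print("Same text:", concatenated == formatted)
```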

# If-else ladder
number = 45
if number > 50:
    print("The number is greater than 50.")
elif number == 50:
    print("The number is exactly 50.")
else:
    print("The number is less than 50.")

---

2) List Operations

# Typecasting
x = "10"
y = int(x) + 5
print("Typecasted value:", y)

# Assigning and printing multiple values
a, b, c = 1, 2, 3
print("a:", a, "b:", b, "c:", c)

# Creating and modifying a list
my_list = [1, 2, 3]
my_list.append(4)
my_list[1] = 10
print("Updated list:", my_list)

# Accessing an index
print("Element at index 2:", my_list[2])

# Using pop and remove
my_list.pop(1)     # Removes element at index 1
my_list.remove(3)  # Removes element with value 3
print("After removal:", my_list)

# Clearing and deleting a list
my_list.clear()
print("Cleared list:", my_list)
del my_list

---

3) Tuples

# Creating and accessing tuples
my_tuple = (10, 20, 30, 40)
print("Element at index 2:", my_tuple[2])

---

4) List Comprehension

# Creating and modifying a list using list comprehension
squares = [x**2 for x in range(1, 11)]
print("Squares of numbers from 1 to 10:", squares)

# Filter even numbers using list comprehension
even_numbers = [x for x in range(1, 11) if x % 2 == 0]
print("Even numbers:", even_numbers)

---

5) Typecasting, Packing, Unpacking, and Sets

# Typecasting
float_value = float(5)
print("Float value:", float_value)

# Packing and unpacking
numbers = 1, 2, 3   # Packing
x, y, z = numbers   # Unpacking
print("Unpacked values:", x, y, z)

# Sets

my_set = {1, 2, 3, 4}
my_set.add(5)
my_set.remove(2)
print("Updated set:", my_set)

---

6) Create and perform operations on a text file

# Create and write to a text file
with open("example.txt", "w") as file:
    file.write("Hello, this is a sample text file.")

# Read the content of the text file
with open("example.txt", "r") as file:
    content = file.read()
    print("File content:", content)

---

7) File Operations (Read and Write)

# Writing to a file
with open("data.txt", "w") as file:
    file.write("Writing to a file in Python.\n")
    file.write("This is the second line.\n")

# Reading from a file
with open("data.txt", "r") as file:
    for line in file:
        print(line.strip())

---
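Exercises 6 and 7 both open their files in "w" mode, which replaces any existing content each time the file is opened. The sketch below contrasts "w" with append mode "a". It is an illustration only: the temporary directory and the three sample strings are assumptions made for the example, chosen so that no real file is overwritten.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory (removed automatically).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.txt")

    with open(path, "w") as file:   # "w" creates the file, or empties it if it exists
        file.write("first\n")

    with open(path, "w") as file:   # opening with "w" again discards "first"
        file.write("second\n")

    with open(path, "a") as file:   # "a" keeps existing content and adds to the end
        file.write("third\n")

    with open(path, "r") as file:
        lines = [line.strip() for line in file]

print("Lines in file:", lines)
```

If earlier content needs to be kept between runs, "a" is usually the safer mode.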

8) Programs

a) Conditional statement for odd and even

number = int(input("Enter a number: "))
if number % 2 == 0:
    print("The number is even.")
else:
    print("The number is odd.")

b) String declaration

my_string = "Hello, Python!"
print("String:", my_string)

c) Accept two numbers and swap using a third variable

a = int(input("Enter first number: "))
b = int(input("Enter second number: "))
temp = a
a = b
b = temp
print("After swapping: a =", a, ", b =", b)

d) Swap values of two variables by computation

a = int(input("Enter first number: "))
b = int(input("Enter second number: "))
a = a + b
b = a - b
a = a - b
print("After swapping: a =", a, ", b =", b)

e) Swap values of two variables by assignment operator

a = int(input("Enter first number: "))
b = int(input("Enter second number: "))
a, b = b, a
print("After swapping: a =", a, ", b =", b)

f) Python calculator for miles per hour

distance = float(input("Enter distance in miles: "))

time = float(input("Enter time in hours: "))
mph = distance / time
print("Speed in miles per hour:", mph)

PROGRAM 9:

AIM: To develop a heart disease prediction model using SVM

SOFTWARE USED: Jupyter Notebook

CODE:

# Import Python packages
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split  # Import train_test_split function
from sklearn import svm  # Import svm model
from sklearn import metrics  # Import scikit-learn metrics module for accuracy calculation
from sklearn.metrics import confusion_matrix

# Import the heart data
data = pd.read_csv("heart.csv")

# Display first 5 lines of heart data

data.head()

# Display basic statistics of data
data.describe()

# Display basic info about the data
data.info()

# Separate Feature and Target Matrix
x = data.drop('output', axis=1)
y = data.output
x.head()

# Split dataset into training set and test set
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=109)  # 70% training and 30% test

# Create a svm Classifier
ml = svm.SVC(kernel='linear')  # Linear Kernel

# Train the model using the training sets
ml.fit(x_train, y_train)

# Predict the response for test dataset
y_pred = ml.predict(x_test)

# Model Accuracy: how often is the classifier correct?
ml.score(x_test, y_test)
confusion_matrix(y_test, y_pred)
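The last two calls above report accuracy and a confusion matrix. As a rough sketch of how the two relate, the plain-Python example below builds a 2x2 matrix in the same layout scikit-learn uses (rows are true labels, columns are predicted labels) and derives accuracy, precision, and recall from it. The label lists and the helper name binary_confusion are made up for illustration; they are not results from heart.csv or part of the original program.

```python
# Made-up labels for illustration only; these are NOT results from heart.csv.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_hat = [1, 0, 0, 1, 0, 1, 1, 1]


def binary_confusion(true_labels, predicted_labels):
    """Return [[tn, fp], [fn, tp]]: rows are true labels, columns are predictions."""
    matrix = [[0, 0], [0, 0]]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[truth][guess] += 1
    return matrix


(tn, fp), (fn, tp) = binary_confusion(y_true, y_hat)

# Accuracy: share of all predictions that are correct (what score() reports for a classifier).
accuracy = (tp + tn) / (tp + tn + fp + fn)
# Precision: of the cases predicted positive, how many really are positive.
precision = tp / (tp + fp)
# Recall: of the truly positive cases, how many were found.
recall = tp / (tp + fn)

print("Confusion matrix:", [[tn, fp], [fn, tp]])
print("Accuracy:", accuracy, "Precision:", precision, "Recall:", recall)
```

For a medical task such as this one, recall may matter more than accuracy, because a false negative corresponds to a missed case.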

OUTPUT:

PROGRAM 10:

AIM: To develop a movie recommendation system

SOFTWARE USED: Jupyter Notebook

CODE:

import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

df = pd.read_csv("movie_dataset.csv")
df.head()

df.describe()
print(df.columns.values)

features = ['genres', 'keywords', 'title', 'cast', 'director']
df['cast'].isnull().values.any()

def combine_features(row):
    return row['title'] + ' ' + row['genres'] + ' ' + row['director'] + ' ' + row['keywords'] + ' ' + row['cast']

# Replace missing values with empty strings so the concatenation does not fail
for feature in features:
    df[feature] = df[feature].fillna('')

df['combined_features'] = df.apply(combine_features, axis=1)
print(df.loc[0, 'combined_features'])

cv = CountVectorizer()
count_matrix = cv.fit_transform(df['combined_features'])
cosine_sim = cosine_similarity(count_matrix)

def get_title_from_index(index):
    return df[df.index == index]["title"].values[0]

def get_index_from_title(title):
    return df[df.title == title]["index"].values[0]

movie_user_likes = "Star Trek Beyond"
movie_index = get_index_from_title(movie_user_likes)

# Access the row for the given movie to get its similarity score against every movie, then enumerate it
similar_movies = list(enumerate(cosine_sim[movie_index]))

# Sort by similarity, highest first; [1:] skips the first entry, which is normally the movie itself
sorted_similar_movies = sorted(similar_movies, key=lambda x: x[1], reverse=True)[1:]

i = 0
print("Top 10 similar movies to " + movie_user_likes + " are:\n")
for element in sorted_similar_movies:
    print(get_title_from_index(element[0]))
    i = i + 1
    if i >= 10:  # stop after ten titles
        break

OUTPUT:
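As a rough illustration of what the CountVectorizer and cosine_similarity steps in Program 10 compute, the sketch below counts words per movie and scores each pair by the cosine of the angle between the count vectors. It uses only the standard library and a plain whitespace split, which is a simplification of CountVectorizer's tokenisation. The three movie titles and their keywords are invented for the example and do not come from movie_dataset.csv.

```python
import math
from collections import Counter

# Hypothetical mini catalogue; titles and keywords are invented for illustration.
movies = {
    "Space Quest": "space adventure alien crew",
    "Star Voyage": "space alien crew mission",
    "Love in Paris": "romance paris love",
}


def cosine(counts_a, counts_b):
    """Cosine similarity between two word-count vectors stored as Counters."""
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a)
    norm_a = math.sqrt(sum(value * value for value in counts_a.values()))
    norm_b = math.sqrt(sum(value * value for value in counts_b.values()))
    return dot / (norm_a * norm_b)


# One bag-of-words vector per movie (the role count_matrix plays above).
vectors = {title: Counter(text.split()) for title, text in movies.items()}

liked = "Space Quest"
scores = {
    title: cosine(vectors[liked], vector)
    for title, vector in vectors.items()
    if title != liked  # leave out the movie itself
}

# Rank the remaining titles from most to least similar.
ranked = sorted(scores, key=scores.get, reverse=True)
print("Most similar to", liked + ":", ranked)
```

Movies that share more words get a score closer to 1, and movies with no words in common score 0, which is why the shared-keyword title ranks first here.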
