Fundamentals of Data Science: Python Basics and Lab Exercises
School
Amity University Dubai
Course
CSE DBMS
Subject
Computer Science
Date
Dec 10, 2024
Pages
15
Uploaded by DeanNeutron18886
FUNDAMENTALS OF DATA SCIENCE AND ANALYTICS
CSIT366 LAB FILE

Department of Computer Science and Engineering
AMITY SCHOOL OF ENGINEERING AND TECHNOLOGY
AMITY UNIVERSITY DUBAI
Batch (2022-2025) (Semester 5)
Bachelor of Science in Information Technology

Under the guidance of:
Faculty: Ms Archana Pandita
Lab Instructor: Munaze Sir

Submitted by: Josue Claver Manamou
Enrolment no: A92604922035
University Email Address: josueM@amitydubai.ae
Submission Date: 09/12/2024

1) Practice printing, string concatenation, and if-else ladder

# Printing and string concatenation
name = "John"
age = 25
print("Hello, my name is " + name + " and I am " + str(age) + " years old.")
# If-else ladder
number = 45
if number > 50:
    print("The number is greater than 50.")
elif number == 50:
    print("The number is exactly 50.")
else:
    print("The number is less than 50.")

---

2) List Operations

# Typecasting
x = "10"
y = int(x) + 5
print("Typecasted value:", y)

# Assigning and printing multiple values
a, b, c = 1, 2, 3
print("a:", a, "b:", b, "c:", c)

# Creating and modifying a list
my_list = [1, 2, 3]
my_list.append(4)
my_list[1] = 10
print("Updated list:", my_list)

# Accessing an index
print("Element at index 2:", my_list[2])

# Using pop and remove
my_list.pop(1)     # Removes element at index 1
my_list.remove(3)  # Removes element with value 3
print("After removal:", my_list)
# Clearing and deleting a list
my_list.clear()
print("Cleared list:", my_list)
del my_list

---

3) Tuples

# Creating and accessing tuples
my_tuple = (10, 20, 30, 40)
print("Element at index 2:", my_tuple[2])

---

4) List Comprehension

# Creating and modifying a list using list comprehension
squares = [x**2 for x in range(1, 11)]
print("Squares of numbers from 1 to 10:", squares)

# Filter even numbers using list comprehension
even_numbers = [x for x in range(1, 11) if x % 2 == 0]
print("Even numbers:", even_numbers)

---

5) Typecasting, Packing, Unpacking, and Sets

# Typecasting
float_value = float(5)
print("Float value:", float_value)

# Packing and unpacking
numbers = 1, 2, 3   # Packing
x, y, z = numbers   # Unpacking
print("Unpacked values:", x, y, z)

# Sets
my_set = {1, 2, 3, 4}
my_set.add(5)
my_set.remove(2)
print("Updated set:", my_set)

---

6) Create and perform operations on a text file

# Create and write to a text file
with open("example.txt", "w") as file:
    file.write("Hello, this is a sample text file.")

# Read the content of the text file
with open("example.txt", "r") as file:
    content = file.read()
print("File content:", content)

---

7) File Operations (Read and Write)

# Writing to a file
with open("data.txt", "w") as file:
    file.write("Writing to a file in Python.\n")
    file.write("This is the second line.\n")

# Reading from a file
with open("data.txt", "r") as file:
    for line in file:
        print(line.strip())

---
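Programs 6 and 7 open their files in "w" mode, which creates the file or empties it if it already exists. As a minimal sketch of how that differs from append mode ("a"), the example below writes two lines, appends a third, and reads everything back. The helper name write_then_append and the file name notes.txt are hypothetical and not part of the lab file; a temporary directory is used so that no file is left behind.

```python
import os
import tempfile


def write_then_append(path):
    """Write two lines with "w", add a third with "a", and return all lines."""
    # "w" creates the file, or truncates it if it already exists.
    with open(path, "w") as file:
        file.write("first line\n")
        file.write("second line\n")
    # "a" keeps the existing content and writes at the end of the file.
    with open(path, "a") as file:
        file.write("appended line\n")
    # Iterating over the file object yields one line at a time.
    with open(path, "r") as file:
        return [line.strip() for line in file]


# Hypothetical file name inside a throwaway directory.
with tempfile.TemporaryDirectory() as folder:
    lines = write_then_append(os.path.join(folder, "notes.txt"))
    print(lines)
```

Opening the same path with "w" a second time would discard all three lines, which is why "a" is the safer choice when adding to an existing log or data file.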
8) Programs

a) Conditional statement for odd and even
number = int(input("Enter a number: "))
if number % 2 == 0:
    print("The number is even.")
else:
    print("The number is odd.")

b) String declaration
my_string = "Hello, Python!"
print("String:", my_string)

c) Accept two numbers and swap using a third variable
a = int(input("Enter first number: "))
b = int(input("Enter second number: "))
temp = a
a = b
b = temp
print("After swapping: a =", a, ", b =", b)

d) Swap values of two variables by computation
a = int(input("Enter first number: "))
b = int(input("Enter second number: "))
a = a + b
b = a - b
a = a - b
print("After swapping: a =", a, ", b =", b)

e) Swap values of two variables by assignment operator
a = int(input("Enter first number: "))
b = int(input("Enter second number: "))
a, b = b, a
print("After swapping: a =", a, ", b =", b)

f) Python calculator for miles per hour
distance = float(input("Enter distance in miles: "))
time = float(input("Enter time in hours: "))
mph = distance / time
print("Speed in miles per hour:", mph)

PROGRAM 9:
AIM: To develop a heart disease prediction model using SVM
SOFTWARE USED: Jupyter Notebook
CODE:

# Import Python packages
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split  # Import train_test_split function
from sklearn import svm      # Import svm model
from sklearn import metrics  # Import scikit-learn metrics module for accuracy calculation
from sklearn.metrics import confusion_matrix

# Import the heart data
data = pd.read_csv("heart.csv")

# Display first 5 lines of heart data
data.head()

# Display basic statistics of data
data.describe()

# Display basic info about the data
data.info()

# Separate feature and target matrix
x = data.drop('output', axis=1)
y = data.output
x.head()

# Split dataset into training set and test set (70% training and 30% test)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=109)

# Create an SVM classifier
ml = svm.SVC(kernel='linear')  # Linear kernel

# Train the model using the training sets
ml.fit(x_train, y_train)

# Predict the response for test dataset
y_pred = ml.predict(x_test)

# Model accuracy: how often is the classifier correct?
ml.score(x_test, y_test)
confusion_matrix(y_test, y_pred)
OUTPUT:
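Program 9 ends by calling ml.score and confusion_matrix. For a classifier, score reports mean accuracy, and scikit-learn's confusion matrix puts true labels on the rows and predicted labels on the columns. As a rough sketch of how the two relate, the pure-Python example below rebuilds a 2x2 matrix and derives accuracy from its diagonal. The function names confusion_counts and accuracy are hypothetical, and the labels are made up for illustration; they are not results from heart.csv.

```python
def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels 0 and 1.

    Rows are the true label, columns are the predicted label.
    """
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


def accuracy(matrix):
    """Fraction of predictions on the diagonal (correct ones)."""
    correct = matrix[0][0] + matrix[1][1]
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up labels for illustration only; not taken from the heart dataset.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = confusion_counts(y_true, y_pred)
print(matrix)            # [[3, 1], [1, 3]]
print(accuracy(matrix))  # 0.75
```

Reading the off-diagonal cells separately matters for a medical task: a false negative (bottom-left) is a missed case of heart disease, which accuracy alone does not reveal.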
PROGRAM 10:
AIM: To develop a movie recommendation system
SOFTWARE USED: Jupyter Notebook
CODE:

import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

df = pd.read_csv("movie_dataset.csv")
df.head()
df.describe()
print(df.columns.values)

features = ['genres', 'keywords', 'title', 'cast', 'director']
df['cast'].isnull().values.any()

def combine_features(row):
    return row['title'] + ' ' + row['genres'] + ' ' + row['director'] + ' ' + row['keywords'] + ' ' + row['cast']

# Replace missing values with empty strings so the concatenation does not fail
for feature in features:
    df[feature] = df[feature].fillna('')

df['combined_features'] = df.apply(combine_features, axis=1)
print(df.loc[0, 'combined_features'])

cv = CountVectorizer()
count_matrix = cv.fit_transform(df['combined_features'])
cosine_sim = cosine_similarity(count_matrix)

def get_title_from_index(index):
    return df[df.index == index]["title"].values[0]

def get_index_from_title(title):
    return df[df.title == title]["index"].values[0]

movie_user_likes = "Star Trek Beyond"
movie_index = get_index_from_title(movie_user_likes)

# Access the row for the given movie to get all of its similarity scores, then enumerate over it
similar_movies = list(enumerate(cosine_sim[movie_index]))

# Sort by score, highest first, and skip the first entry (the movie itself)
sorted_similar_movies = sorted(similar_movies, key=lambda x: x[1], reverse=True)[1:]

i = 0
print("Top 10 similar movies to " + movie_user_likes + " are:\n")
for element in sorted_similar_movies:
    print(get_title_from_index(element[0]))
    i = i + 1
    if i >= 10:  # Stop after exactly 10 titles
        break

OUTPUT:
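Program 10 relies on CountVectorizer and cosine_similarity to score how alike two movies are. As a simplified sketch of the underlying idea, the example below counts words with collections.Counter and computes the cosine of the angle between the two count vectors. The tokenizer here (lowercase, split on whitespace) is an assumption and is cruder than CountVectorizer's own rules; the function names and the three toy movie descriptions are hypothetical and do not come from movie_dataset.csv.

```python
import math
from collections import Counter


def word_counts(text):
    """Bag-of-words counts using a simplified whitespace tokenizer."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    # Counter returns 0 for words missing from b, so only shared words add up.
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(liked, catalogue):
    """Return (title, score) pairs for all other titles, best match first."""
    target = word_counts(catalogue[liked])
    scores = [
        (title, cosine(target, word_counts(text)))
        for title, text in catalogue.items()
        if title != liked
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy catalogue for illustration only.
catalogue = {
    "Movie A": "space adventure captain crew",
    "Movie B": "space adventure alien crew",
    "Movie C": "romantic comedy wedding",
}

ranked = rank_similar("Movie A", catalogue)
print(ranked)  # [('Movie B', 0.75), ('Movie C', 0.0)]
```

Movie A and Movie B share three of their four words, giving 3 / (2 * 2) = 0.75, while Movie C shares none and scores 0. Excluding the liked title inside rank_similar avoids depending on it sorting to the first position.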