CHANDIGARH UNIVERSITY
Institute of Distance and Online Learning

Course Development Committee
Prof. (Dr.) R.S. Bawa, Pro Chancellor, Chandigarh University, Gharuan, Punjab

Advisors
Prof. (Dr.) Bharat Bhushan, Director – IGNOU
Prof. (Dr.) Majulika Srivastava, Director – CIQA, IGNOU

Programme Coordinators & Editing Team
Master of Business Administration (MBA): Coordinator – Dr. Rupali Arora
Bachelor of Business Administration (BBA): Coordinator – Dr. Simran Jewandah
Master of Computer Applications (MCA): Coordinator – Dr. Raju Kumar
Bachelor of Computer Applications (BCA): Coordinator – Dr. Manisha Malhotra
Master of Commerce (M.Com.): Coordinator – Dr. Aman Jindal
Bachelor of Commerce (B.Com.): Coordinator – Dr. Minakshi Garg
Master of Arts (Psychology): Coordinator – Dr. Samerjeet Kaur
Bachelor of Science (Travel & Tourism Management): Coordinator – Dr. Shikha Sharma
Master of Arts (English): Coordinator – Dr. Ashita Chadha
Bachelor of Arts (General): Coordinator – Ms. Neeraj Gohlan

Academic and Administrative Management
Prof. (Dr.) R. M. Bhagat, Executive Director – Sciences
Prof. (Dr.) S.S. Sehgal, Registrar
Prof. (Dr.) Manaswini Acharya, Executive Director – Liberal Arts
Prof. (Dr.) Gurpreet Singh, Director – IDOL

© No part of this publication should be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording and/or otherwise without the prior written permission of the authors and the publisher.

SLM SPECIALLY PREPARED FOR CU IDOL STUDENTS

Printed and Published by: TeamLease Edtech Limited
www.teamleaseedtech.com
CONTACT NO: 01133002345
For: CHANDIGARH UNIVERSITY Institute of Distance and Online Learning
PYTHON PROGRAMMING (Practical)
Course Code: MCA656
Credit: 1

Course Objectives:
• This course provides a comprehensive introduction to Python programming and its core concepts.
• Students will work with simple Python programs and develop programs that use conditionals and loops.
• Students will learn to store and retrieve data from a database using Python and to develop a Python graphical user interface.

PRACTICAL 1: WRITE A PROGRAM TO PRINT TWIN PRIMES LESS THAN 1000. IF TWO CONSECUTIVE ODD NUMBERS ARE BOTH PRIME THEN THEY ARE KNOWN AS TWIN PRIMES.

1. Program:

def is_prime(n):
    # Numbers below 2 are not prime
    if n < 2:
        return False
    # n is prime if no number from 2 to n-1 divides it
    for i in range(2, n):
        if n % i == 0:
            return False
    return True

def generate_twins(start, end):
    # Test every pair (i, i+2) with i in [start, end)
    for i in range(start, end):
        j = i + 2
        if is_prime(i) and is_prime(j):
            print("{:d} and {:d}".format(i, j))

generate_twins(2, 1000)

Output:
3 and 5
5 and 7
11 and 13
17 and 19
29 and 31
41 and 43
59 and 61
71 and 73
101 and 103
107 and 109
137 and 139
149 and 151
179 and 181
191 and 193
197 and 199
227 and 229
239 and 241
269 and 271
281 and 283
311 and 313
347 and 349
419 and 421
431 and 433
461 and 463
521 and 523
569 and 571
599 and 601
617 and 619
641 and 643
659 and 661
809 and 811
821 and 823
827 and 829
857 and 859
881 and 883
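The program above tests each number by trial division, which is easy to follow but repeats a lot of work. As an alternative sketch (not part of the original practical), the Sieve of Eratosthenes marks all primes below the limit once and then scans for pairs. The function name twin_primes_below is an assumption introduced for this illustration.

```python
def twin_primes_below(limit):
    """Return all twin-prime pairs (p, p + 2) with p + 2 < limit."""
    if limit < 5:
        return []
    # is_prime[k] stays True while k is still considered prime.
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            # Cross out every multiple of p, starting at p * p.
            for multiple in range(p * p, limit, p):
                is_prime[multiple] = False
    # Only odd p can start a twin-prime pair.
    return [(p, p + 2) for p in range(3, limit - 2, 2)
            if is_prime[p] and is_prime[p + 2]]


pairs = twin_primes_below(1000)
print(len(pairs), "pairs; first:", pairs[0], "last:", pairs[-1])
```

The sieve trades a list of `limit` booleans for far fewer divisions, which matters once the limit grows well beyond 1000.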
PRACTICAL 2: WRITE A PROGRAM TO IMPLEMENT THESE FORMULAE OF PERMUTATIONS AND COMBINATIONS.
NUMBER OF PERMUTATIONS OF N OBJECTS TAKEN R AT A TIME: P(N, R) = N! / (N-R)!
NUMBER OF COMBINATIONS OF N OBJECTS TAKEN R AT A TIME: C(N, R) = N! / (R! * (N-R)!) = P(N, R) / R!

1. Program:

print("Enter the Value of n: ")
n = int(input())
print("Enter the Value of r: ")
r = int(input())

# The formulae are defined only for 0 <= r <= n
if r < 0 or r > n:
    print("r must lie between 0 and n")
else:
    fact = 1
    i = 1
    while i <= n:
        fact = i * fact
        i = i + 1
    numerator = fact          # n!

    sub = n - r               # (n-r)
    fact = 1
    i = 1
    while i <= sub:
        fact = i * fact
        i = i + 1
    denominator = fact        # (n-r)!

    fact = 1
    i = 1
    while i <= r:
        fact = i * fact
        i = i + 1
    r_fact = fact             # r!

    # Integer division keeps the results exact
    perm = numerator // denominator
    comb = perm // r_fact
    print("\nPermutation =", perm)
    print("Combination =", comb)

Output:
Enter the Value of n: 
5
Enter the Value of r: 
2

Permutation = 20
Combination = 10
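The same two formulae can also be written with the standard library's math.factorial, which avoids the hand-written loops. This is a minimal sketch; the function names permutations and combinations are chosen here for illustration and are not from the original text.

```python
from math import factorial


def permutations(n, r):
    """P(n, r) = n! / (n - r)!"""
    if not 0 <= r <= n:
        raise ValueError("r must satisfy 0 <= r <= n")
    return factorial(n) // factorial(n - r)


def combinations(n, r):
    """C(n, r) = P(n, r) / r!"""
    return permutations(n, r) // factorial(r)


print("P(5, 2) =", permutations(5, 2))
print("C(5, 2) =", combinations(5, 2))
```

Raising an error for r greater than n makes an invalid input visible instead of silently returning a meaningless number.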
PRACTICAL 3: TWO DIFFERENT NUMBERS ARE CALLED AMICABLE NUMBERS IF THE SUM OF THE PROPER DIVISORS OF EACH IS EQUAL TO THE OTHER NUMBER. FOR EXAMPLE 220 AND 284 ARE AMICABLE NUMBERS.
SUM OF PROPER DIVISORS OF 220 = 1+2+4+5+10+11+20+22+44+55+110 = 284
SUM OF PROPER DIVISORS OF 284 = 1+2+4+71+142 = 220
WRITE A FUNCTION TO PRINT PAIRS OF AMICABLE NUMBERS IN A RANGE.

1. Program:

# Python3 program to count
# amicable pairs in an array

# Calculate the sum
# of proper divisors
def sumOfDiv(x):
    total = 1
    for i in range(2, x):
        if x % i == 0:
            total += i
    return total

# Check if pair is amicable
def isAmicable(a, b):
    return sumOfDiv(a) == b and sumOfDiv(b) == a

# This function counts the
# amicable pairs present
# in the input array
def countPairs(arr, n):
    count = 0
    for i in range(0, n):
        for j in range(i + 1, n):
            if isAmicable(arr[i], arr[j]):
                count = count + 1
    return count

# Driver Code
arr1 = [220, 284, 1184, 1210, 2, 5]
n1 = len(arr1)
print(countPairs(arr1, n1))

arr2 = [2620, 2924, 5020, 5564, 6232, 6368]
n2 = len(arr2)
print(countPairs(arr2, n2))

Output:
2
3
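The practical asks for a function that prints the amicable pairs in a range, whereas the program above counts pairs in a given array. A possible sketch of the range version follows; the names sum_proper_divisors and amicable_pairs are assumptions made for this example, and the divisor sum only tries divisors up to the square root to keep the search fast.

```python
def sum_proper_divisors(x):
    """Sum of the divisors of x that are smaller than x."""
    if x < 2:
        return 0
    total = 1
    d = 2
    while d * d <= x:
        if x % d == 0:
            total += d
            partner = x // d
            if partner != d:      # do not count a square root twice
                total += partner
        d += 1
    return total


def amicable_pairs(start, end):
    """Return amicable pairs (a, b) with start <= a < b <= end."""
    pairs = []
    for a in range(start, end + 1):
        b = sum_proper_divisors(a)
        # a < b reports each pair once and skips perfect numbers (a == b).
        if a < b <= end and sum_proper_divisors(b) == a:
            pairs.append((a, b))
    return pairs


for a, b in amicable_pairs(1, 1500):
    print(a, "and", b)
```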
PRACTICAL 4: WRITE A PROGRAM TO GET USER ID, USER NAME, AND USER AGE FROM THE USER AND, BASED ON THE ENTERED ID, PRINT THE DETAILS OF THE PARTICULAR USER. HINT: USE A DICTIONARY.

1. Program:

def personal_details():
    name, age = "Simron", 19
    address = "Bangalore, Karnataka, India"
    print("Name: {}\nAge: {}\nAddress: {}".format(name, age, address))

personal_details()

Output:
Name: Simron
Age: 19
Address: Bangalore, Karnataka, India
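The hint in the practical suggests a dictionary keyed by user ID. The following is one way this could look, offered as a sketch: the IDs, the second user "Arjun", and the helper names add_user and describe_user are all hypothetical and introduced only for illustration.

```python
# Maps each user ID to that user's details.
users = {}


def add_user(user_id, name, age):
    """Store (or overwrite) the details for one user ID."""
    users[user_id] = {"name": name, "age": age}


def describe_user(user_id):
    """Return a printable description of the user with this ID."""
    details = users.get(user_id)
    if details is None:
        return "No user found with ID {}".format(user_id)
    return "ID: {}\nName: {}\nAge: {}".format(
        user_id, details["name"], details["age"])


# Sample records (illustrative values only).
add_user(101, "Simron", 19)
add_user(102, "Arjun", 22)

print(describe_user(101))
print(describe_user(999))
```

In an interactive version, the arguments to add_user and describe_user would come from int(input()) and input() calls instead of the fixed sample values.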
PRACTICAL 5: WRITE A PROGRAM TO PERFORM VARIOUS OPERATIONS ON TUPLES SUCH AS ADDING TUPLE, REPLACING TUPLE, SLICING TUPLE AND DELETING TUPLE.

1. Program:

# Different types of tuples

# Empty tuple
my_tuple = ()
print(my_tuple)

# Tuple having integers
my_tuple = (1, 2, 3)
print(my_tuple)

# Tuple with mixed datatypes
my_tuple = (1, "Hello", 3.4)
print(my_tuple)

# Nested tuple
my_tuple = ("mouse", [8, 4, 6], (1, 2, 3))
print(my_tuple)

Output:
()
(1, 2, 3)
(1, 'Hello', 3.4)
('mouse', [8, 4, 6], (1, 2, 3))

A tuple is a collection of Python objects, much like a list. The values stored in a tuple can be of any type, and they are indexed by integers. The values are separated by commas. Although it is not required, it is common to enclose the sequence of values in parentheses, which makes tuples easier to read.

Creating a Tuple
In Python, a tuple is created by placing a sequence of values separated by commas, with or without parentheses to group them. The following Python program demonstrates creating tuples from different kinds of elements.

Program:

# Creating an empty Tuple
Tuple1 = ()
print("Initial empty Tuple: ")
print(Tuple1)

# Creating a Tuple
# with the use of string
Tuple1 = ('Geeks', 'For')
print("\nTuple with the use of String: ")
print(Tuple1)

# Creating a Tuple with
# the use of list
list1 = [1, 2, 4, 5, 6]
print("\nTuple using List: ")
print(tuple(list1))

# Creating a Tuple
# with the use of built-in function
Tuple1 = tuple('Geeks')
print("\nTuple with the use of function: ")
print(Tuple1)

Output:
Initial empty Tuple: 
()

Tuple with the use of String: 
('Geeks', 'For')

Tuple using List: 
(1, 2, 4, 5, 6)

Tuple with the use of function: 
('G', 'e', 'e', 'k', 's')

Creating a Tuple with Mixed Datatypes
Tuples can contain any number of elements of any datatype (strings, integers, lists, and so on). A tuple can also be created with a single element, but this is a little tricky: one element in parentheses is not sufficient, and there must be a trailing comma to make it a tuple.

# Creating a Tuple
# with Mixed Datatype
Tuple1 = (5, 'Welcome', 7, 'Geeks')
print("\nTuple with Mixed Datatypes: ")
print(Tuple1)

# Creating a Tuple
# with nested tuples
Tuple1 = (0, 1, 2, 3)
Tuple2 = ('python', 'geek')
Tuple3 = (Tuple1, Tuple2)
print("\nTuple with nested tuples: ")
print(Tuple3)
# Creating a Tuple
# with repetition
Tuple1 = ('Geeks',) * 3
print("\nTuple with repetition: ")
print(Tuple1)

# Creating a Tuple
# with the use of loop
# Note: ('Geeks') without a trailing comma is a plain string;
# each pass of the loop wraps it in one more tuple
Tuple1 = ('Geeks')
n = 5
print("\nTuple with a loop")
for i in range(int(n)):
    Tuple1 = (Tuple1,)
    print(Tuple1)

Output:

Tuple with Mixed Datatypes: 
(5, 'Welcome', 7, 'Geeks')

Tuple with nested tuples: 
((0, 1, 2, 3), ('python', 'geek'))

Tuple with repetition: 
('Geeks', 'Geeks', 'Geeks')

Tuple with a loop
('Geeks',)
(('Geeks',),)
((('Geeks',),),)
(((('Geeks',),),),)
((((('Geeks',),),),),)

Accessing of Tuples
Tuples are immutable, and they usually contain a sequence of heterogeneous elements that are accessed via unpacking or indexing (or even by attribute in the case of named tuples). Lists are mutable, and their elements are usually homogeneous and are accessed by iterating over the list.

Note: When unpacking a tuple, the number of variables on the left-hand side must equal the number of values in the tuple.

# Accessing Tuple
# with Indexing
Tuple1 = tuple("Geeks")
print("\nFirst element of Tuple: ")
print(Tuple1[0])

# Tuple unpacking
Tuple1 = ("Geeks", "For", "Geeks")

# This line unpacks
# the values of Tuple1
a, b, c = Tuple1
print("\nValues after unpacking: ")
print(a)
print(b)
print(c)

Output:

First element of Tuple: 
G

Values after unpacking: 
Geeks
For
Geeks

Concatenation of Tuples
Concatenation is the process of joining two or more tuples, and it is done with the '+' operator. The elements of the second tuple are always appended at the end of the first. Apart from repetition with '*', other arithmetic operators do not apply to tuples.
# Concatenation of tuples
Tuple1 = (0, 1, 2, 3)
Tuple2 = ('Geeks', 'For', 'Geeks')
Tuple3 = Tuple1 + Tuple2

# Printing first Tuple
print("Tuple 1: ")
print(Tuple1)

# Printing Second Tuple
print("\nTuple2: ")
print(Tuple2)

# Printing Final Tuple
print("\nTuples after Concatenation: ")
print(Tuple3)

Output:
Tuple 1: 
(0, 1, 2, 3)

Tuple2: 
('Geeks', 'For', 'Geeks')

Tuples after Concatenation: 
(0, 1, 2, 3, 'Geeks', 'For', 'Geeks')

Slicing of Tuple
Slicing a tuple fetches a specific range of elements from it. Slicing also works on lists and arrays. Indexing fetches a single element, whereas slicing fetches a set of elements.

# Slicing of a Tuple
# with Numbers
Tuple1 = tuple('GEEKSFORGEEKS')

# Removing First element
print("Removal of First Element: ")
print(Tuple1[1:])

# Reversing the Tuple
print("\nTuple after sequence of Element is reversed: ")
print(Tuple1[::-1])

# Printing elements of a Range
print("\nPrinting elements between Range 4-9: ")
print(Tuple1[4:9])

Deleting a Tuple
Tuples are immutable, so a part of a tuple cannot be deleted. The entire tuple is deleted with the del statement.

Note: Printing a tuple after it has been deleted results in an error.

# Deleting a Tuple
Tuple1 = (0, 1, 2, 3, 4)
del Tuple1
print(Tuple1)

Traceback (most recent call last):
  File "/home/efa50fd0709dec08434191f32275928a.py", line 7, in <module>
    print(Tuple1)
NameError: name 'Tuple1' is not defined
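The practical also mentions replacing elements of a tuple. Because tuples are immutable, an element cannot be assigned in place; a common approach is to build a new tuple from slices of the old one. The helper name replace_at below is an assumption made for this sketch.

```python
def replace_at(tup, index, value):
    """Return a new tuple equal to tup with the item at index replaced."""
    if not -len(tup) <= index < len(tup):
        raise IndexError("tuple index out of range")
    index %= len(tup)  # turn a negative index into a positive one
    # Join the part before the index, the new value, and the part after it.
    return tup[:index] + (value,) + tup[index + 1:]


original = (0, 1, 2, 3, 4)
updated = replace_at(original, 2, "two")

print("Original:", original)   # the original tuple is left unchanged
print("Updated: ", updated)

# Dropping an element works the same way: keep the slices around it.
without_first = original[1:]
print("Without first element:", without_first)
```

Converting to a list, editing it, and converting back with tuple() would give the same result; the slice version simply avoids the intermediate list.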
PRACTICAL 6: IMPLEMENT A STUDENT CLASS WITH INFORMATION SUCH AS ROLLNO, NAME, CLASS. THE INFORMATION MUST BE ENTERED BY THE USER.

1. Program:

# This is a simple student data management program in Python

# Create class "Student"
class Student:

    # Constructor
    def __init__(self, name, rollno, m1, m2):
        self.name = name
        self.rollno = rollno
        self.m1 = m1
        self.m2 = m2

    # Function to create and append new student
    def accept(self, Name, Rollno, marks1, marks2):
        # use int(input()) here to take the values from the user
        ob = Student(Name, Rollno, marks1, marks2)
        ls.append(ob)

    # Function to display student details
    def display(self, ob):
        print("Name :", ob.name)
        print("RollNo :", ob.rollno)
        print("Marks1 :", ob.m1)
        print("Marks2 :", ob.m2)
        print("\n")

    # Search Function: returns the list index of the student
    # with roll number rn (None if there is no such student)
    def search(self, rn):
        for i in range(len(ls)):
            if ls[i].rollno == rn:
                return i

    # Delete Function
    def delete(self, rn):
        i = self.search(rn)
        del ls[i]

    # Update Function: change roll number rn to No
    def update(self, rn, No):
        i = self.search(rn)
        ls[i].rollno = No

# Create a list to add Students
ls = []

# an object of Student class
obj = Student('', 0, 0, 0)

print("\nOperations used, ")
print("\n1.Accept Student details\n2.Display Student Details\n"
      "3.Search Details of a Student\n4.Delete Details of Student"
      "\n5.Update Student Details\n6.Exit")

# ch = int(input("Enter choice:"))
# if(ch == 1):
obj.accept("A", 1, 100, 100)
obj.accept("B", 2, 90, 90)
obj.accept("C", 3, 80, 80)

# elif(ch == 2):
print("\n")
print("\nList of Students\n")
for i in range(len(ls)):
    obj.display(ls[i])

# elif(ch == 3):
print("\n Student Found, ")
s = obj.search(2)
obj.display(ls[s])

# elif(ch == 4):
obj.delete(2)
print(len(ls))
print("List after deletion")
for i in range(len(ls)):
    obj.display(ls[i])

# elif(ch == 5):
obj.update(3, 2)
print(len(ls))
print("List after updation")
for i in range(len(ls)):
    obj.display(ls[i])

# else:
print("Thank You!")

Output:

Operations used, 

1.Accept Student details
2.Display Student Details
3.Search Details of a Student
4.Delete Details of Student
5.Update Student Details
6.Exit

List of Students

Name : A
RollNo : 1
Marks1 : 100
Marks2 : 100

Name : B
RollNo : 2
Marks1 : 90
Marks2 : 90

Name : C
RollNo : 3
Marks1 : 80
Marks2 : 80

 Student Found, 
Name : B
RollNo : 2
Marks1 : 90
Marks2 : 90

2
List after deletion
Name : A
RollNo : 1
Marks1 : 100
Marks2 : 100

Name : C
RollNo : 3
Marks1 : 80
Marks2 : 80

2
List after updation
Name : A
RollNo : 1
Marks1 : 100
Marks2 : 100

Name : C
RollNo : 2
Marks1 : 80
Marks2 : 80

Thank You!
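The practical statement asks for roll number, name and class to be entered by the user, while the program above uses fixed values. A minimal sketch of the input-driven variant is given below. The field name student_class and the helper read_student are assumptions made here; the ask parameter defaults to the built-in input, and the demonstration passes scripted answers so that the sketch runs without a keyboard.

```python
class Student:
    """Holds the three fields named in the practical statement."""

    def __init__(self, rollno, name, student_class):
        self.rollno = rollno
        self.name = name
        self.student_class = student_class

    def __str__(self):
        return "RollNo: {}\nName: {}\nClass: {}".format(
            self.rollno, self.name, self.student_class)


def read_student(ask=input):
    """Build a Student from answers obtained through ask(prompt)."""
    rollno = int(ask("Enter roll number: "))
    name = ask("Enter name: ").strip()
    student_class = ask("Enter class: ").strip()
    return Student(rollno, name, student_class)


# Scripted answers stand in for keyboard input (illustrative values).
answers = iter(["7", "Asha", "MCA-1"])
student = read_student(ask=lambda prompt: next(answers))
print(student)
```

Calling read_student() with no argument would prompt at the console instead, since ask then falls back to the built-in input function.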
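PRACTICAL 7: WRITE A PYTHON PROGRAM TO GENERATE A SIMPLE BAR GRAPH USING PYPLOT. THE GRAPH SHOULD BE PROPERLY LABELED.

1. Program:

import matplotlib.pyplot as plt

# Sample data (illustrative values): students and their math scores
x = ['A', 'B', 'C', 'D', 'E']
y = [72, 85, 64, 91, 78]

barplot = plt.bar(x, y)

# Write the value of each bar just above it
for bar in barplot:
    yval = bar.get_height()
    # ha/va: horizontal and vertical alignment of the text
    plt.text(bar.get_x() + bar.get_width()/2.0, yval, int(yval),
             ha='center', va='bottom')

plt.title("Simple Bar graph")
plt.xlabel('Students')
plt.ylabel("Math Score")
plt.show()

Output: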
PRACTICAL 8: WRITE A PYTHON PROGRAM TO GENERATE PIE-CHART USING PYPLOT. THE GRAPH SHOULD BE PROPERLY LABELED.

1. Program:

import pandas as pd
import matplotlib.pyplot as plt

share = [20, 12, 11, 4, 3]
companies = ['Google', 'Facebook', 'Apple', 'Microsoft', 'IBM']
comp = pd.DataFrame({"share": share, "companies": companies})

# Draw the pie chart with company names as labels
# and the percentage printed on each slice
ax = comp.plot(y="share", kind="pie", labels=comp["companies"],
               autopct='%1.0f%%', legend=False, title='Market Share')

# Hide y-axis label
ax.set(ylabel='')
plt.show()

Output:
PRACTICAL 9: WRITE A PYTHON PROGRAM TO PLOT THE FUNCTION Y = X^2 USING THE PYPLOT OR MATPLOTLIB LIBRARIES.

1. Program:

import matplotlib.pyplot as plt
import numpy as np

def plot_parabola():
    # create 1000 equally spaced points between -10 and 10
    x = np.linspace(-10, 10, 1000)
    y = x**2
    fig, ax = plt.subplots()
    ax.plot(x, y)
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    plt.show()

plot_parabola()

Output:
PRACTICAL 10: WRITE A PROGRAM IN PYTHON TO COMPUTE THE GREATEST COMMON DIVISOR AND THE LEAST COMMON MULTIPLE OF TWO INTEGERS.

1. Program:

def gcd(x, y):
    result = 1
    # If y divides x, y itself is the G.C.D.
    if x % y == 0:
        return y
    # Otherwise the G.C.D. is at most y/2; search downwards
    for k in range(int(y / 2), 0, -1):
        if x % k == 0 and y % k == 0:
            result = k
            break
    return result

print(gcd(12, 17))
print(gcd(4, 6))

Output:
1
2
CODE FOR LCM:

# Python Program to find the L.C.M. of two input numbers

def compute_lcm(x, y):
    # choose the greater number
    if x > y:
        greater = x
    else:
        greater = y

    # try successive numbers until one is divisible by both
    while True:
        if (greater % x == 0) and (greater % y == 0):
            lcm = greater
            break
        greater += 1

    return lcm

num1 = 54
num2 = 24
print("The L.C.M. is", compute_lcm(num1, num2))

Output:
The L.C.M. is 216
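Both programs above search candidate values one by one. As an alternative sketch (not the method used in the original programs), Euclid's algorithm finds the G.C.D. by repeated remainders, and the L.C.M. then follows from the identity lcm(a, b) × gcd(a, b) = |a × b|. The function names gcd and lcm are chosen here for illustration.

```python
def gcd(a, b):
    """Greatest common divisor by Euclid's algorithm."""
    a, b = abs(a), abs(b)
    while b:
        # Replace (a, b) with (b, a mod b) until the remainder is zero.
        a, b = b, a % b
    return a


def lcm(a, b):
    """Least common multiple, derived from the G.C.D."""
    if a == 0 or b == 0:
        return 0
    return abs(a * b) // gcd(a, b)


print("G.C.D. of 12 and 17:", gcd(12, 17))
print("G.C.D. of 4 and 6:", gcd(4, 6))
print("L.C.M. of 54 and 24:", lcm(54, 24))
```

The standard library also offers math.gcd, which could replace the hand-written function in practice.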
PRACTICAL 11: WRITE A PROGRAM IN PYTHON TO READ AND SORT A LIST OF INTEGER ELEMENTS USING THE BUBBLE SORT METHOD. DISPLAY THE SORTED ELEMENTS ON THE SCREEN.

1. Program:

def bubbleSort(nlist):
    # After each pass the largest remaining element
    # settles at position passnum
    for passnum in range(len(nlist)-1, 0, -1):
        for i in range(passnum):
            # Swap neighbours that are in the wrong order
            if nlist[i] > nlist[i+1]:
                temp = nlist[i]
                nlist[i] = nlist[i+1]
                nlist[i+1] = temp

nlist = [14, 46, 43, 27, 57, 41, 45, 21, 70]
bubbleSort(nlist)
print(nlist)

Output:
[14, 21, 27, 41, 43, 45, 46, 57, 70]
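A common refinement of bubble sort stops as soon as a full pass makes no swap, since the list is then already sorted. The sketch below adds that check and also shows one way the list could be read from a line of text, as the practical statement asks; the names bubble_sort and parse_integers are assumptions for this example, and the sample line stands in for a value that would normally come from input().

```python
def bubble_sort(values):
    """Return a sorted copy of values using bubble sort with early exit."""
    items = list(values)
    for last in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(last):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:
            # No swap in a whole pass means the list is sorted.
            break
    return items


def parse_integers(text):
    """Convert a space-separated line such as '3 1 2' into a list of ints."""
    return [int(token) for token in text.split()]


# In an interactive run this line would be: line = input("Enter integers: ")
line = "14 46 43 27 57 41 45 21 70"
numbers = parse_integers(line)
print(bubble_sort(numbers))
```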
PRACTICAL 12: WRITE A PROGRAM IN PYTHON TO FIND OUT THE FREQUENCY OF EACH ELEMENT IN A LIST USING A DICTIONARY.

1. Program:

# Python program to count the frequency of
# elements in a list using a dictionary

def CountFrequency(my_list):

    # Creating an empty dictionary
    freq = {}
    for item in my_list:
        if item in freq:
            freq[item] += 1
        else:
            freq[item] = 1

    # Print the counts in ascending order of the keys
    for key, value in sorted(freq.items()):
        print("{}: {}".format(key, value))

# Driver function
if __name__ == "__main__":
    my_list = [1, 1, 1, 5, 5, 3, 1, 3, 3, 1, 4, 4, 4, 2, 2, 2, 2]
    CountFrequency(my_list)

Output:
1: 5
2: 4
3: 3
4: 3
5: 2
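For comparison, the standard library's collections.Counter is a dictionary subclass that performs the same counting in one call. The sketch below reuses the list from the program above.

```python
from collections import Counter

my_list = [1, 1, 1, 5, 5, 3, 1, 3, 3, 1, 4, 4, 4, 2, 2, 2, 2]

# Counter maps each distinct element to the number of times it occurs.
freq = Counter(my_list)

for key in sorted(freq):
    print("{}: {}".format(key, freq[key]))

# most_common(1) returns the most frequent element and its count.
print("Most common:", freq.most_common(1))
```

Writing the loop by hand, as in the program above, is still useful for seeing how the dictionary is built up step by step.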
PRACTICAL 13: IMPLEMENT DATA ANALYSIS ON DATASET IN FOLLOWING MODULES:
a. Importing the data
b. Data Wrangling
c. EDA
d. Model Development
e. Model Evaluation and Refinement

1. Program:

a. Importing the data

Program:

import csv

# Open the CSV file and print it row by row
with open("E:\\customers.csv", 'r') as custfile:
    rows = csv.reader(custfile, delimiter=',')
    for r in rows:
        print(r)

Output:
['customerID', 'gender', 'Contract', 'PaperlessBilling', 'Churn']
['7590-VHVEG', 'Female', 'Month-to-month', 'Yes', 'No']
['5575-GNVDE', 'Male', 'One year', 'No', 'No']
['3668-QPYBK', 'Male', 'Month-to-month', 'Yes', 'Yes']
['7795-CFOCW', 'Male', 'One year', 'No', 'No']
……

b. Data Wrangling

Program:

The Pandas library in Python provides a single function, merge, as the entry point for all standard database join operations between DataFrame objects:

pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None,
         left_index=False, right_index=False, sort=True)

Let us now create two different DataFrames on which merging operations can be performed.

# import the pandas library
import pandas as pd

left = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5']})

right = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5']})

print(left)
print(right)

Its output is as follows:

   id    Name subject_id
0   1    Alex       sub1
1   2     Amy       sub2
2   3   Allen       sub4
3   4   Alice       sub6
4   5  Ayoung       sub5
   id   Name subject_id
0   1  Billy       sub2
1   2  Brian       sub4
2   3   Bran       sub3
3   4  Bryce       sub6
4   5  Betty       sub5

c. EDA

Program:

import pandas as pd

Df = pd.read_csv("https://vincentarelbundock.github.io/Rdatasets/csv/car/Chile.csv")
Output:

d. Model Development

Program:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display
# IPython.html.widgets has moved to ipywidgets
import ipywidgets as widgets
from ipywidgets import interact, interactive, fixed, interact_manual

df = pd.read_csv('../input/auto_clean.csv')
df.head()

Output (first five rows, shown with one dataset column per line):

Column                0            1            2            3            4
symboling             3            3            1            2            2
normalized-losses     122          122          122          164          164
make                  alfa-romero  alfa-romero  alfa-romero  audi         audi
aspiration            std          std          std          std          std
num-of-doors          two          two          two          four         four
body-style            convertible  convertible  hatchback    sedan        sedan
drive-wheels          rwd          rwd          rwd          fwd          4wd
engine-location       front        front        front        front        front
wheel-base            88.6         88.6         94.5         99.8         99.4
length                0.811148     0.811148     0.822681     0.848630     0.848630
width                 0.890278     0.890278     0.909722     0.919444     0.922222
height                48.8         48.8         52.4         54.3         54.3
curb-weight           2548         2548         2823         2337         2824
engine-type           dohc         dohc         ohcv         ohc          ohc
num-of-cylinders      four         four         six          four         five
engine-size           130          130          152          109          136
fuel-system           mpfi         mpfi         mpfi         mpfi         mpfi
bore                  3.47         3.47         2.68         3.19         3.19
stroke                2.68         2.68         3.47         3.40         3.40
compression-ratio     9.0          9.0          9.0          10.0         8.0
horsepower            111.0        111.0        154.0        102.0        115.0
peak-rpm              5000.0       5000.0       5000.0       5500.0       5500.0
city-mpg              21           21           19           24           18
highway-mpg           27           27           26           30           22
price                 13495.0      16500.0      16500.0      13950.0      17450.0
city-L/100km          11.190476    11.190476    12.368421    9.791667     13.055556
horsepower-binned     Medium       Medium       Medium       Medium       Medium
diesel                0            0            0            0            0
gas                   1            1            1            1            1

e. Model Evaluation and Refinement

Metrics available for various machine learning tasks are detailed in the sections below. Many metrics are not given names to be used as scoring values, sometimes because they require additional parameters, such as fbeta_score. In such cases, you need to generate an appropriate scoring object. The simplest way to generate a callable object for scoring is by using make_scorer. That function converts metrics into callables that can be used for model evaluation.

One typical use case is to wrap an existing metric function from the library with non-default values for its parameters, such as the beta parameter for the fbeta_score function:

>>> from sklearn.metrics import fbeta_score, make_scorer
>>> ftwo_scorer = make_scorer(fbeta_score, beta=2)
>>> from sklearn.model_selection import GridSearchCV
>>> from sklearn.svm import LinearSVC
>>> grid = GridSearchCV(LinearSVC(), param_grid={'C': [1, 10]},
...                     scoring=ftwo_scorer, cv=5)

The second use case is to build a completely custom scorer object from a simple Python function using make_scorer, which can take several parameters:
• the Python function you want to use (my_custom_loss_func in the example below)
• whether the Python function returns a score (greater_is_better=True, the default) or a loss (greater_is_better=False). If a loss, the output of the Python function is negated by the scorer object, conforming to the cross-validation convention that scorers return higher values for better models.
• for classification metrics only: whether the Python function you provided requires continuous decision certainties (needs_threshold=True). The default value is False.
• any additional parameters, such as beta or labels in f1_score.

Here is an example of building custom scorers, and of using the greater_is_better parameter:

>>> import numpy as np
>>> def my_custom_loss_func(y_true, y_pred):
...     diff = np.abs(y_true - y_pred).max()
...     return np.log1p(diff)
...
>>> # score will negate the return value of my_custom_loss_func,
>>> # which will be np.log(2), 0.693, given the values for X
>>> # and y defined below.
>>> score = make_scorer(my_custom_loss_func, greater_is_better=False)
>>> X = [[1], [1]]
>>> y = [0, 1]
>>> from sklearn.dummy import DummyClassifier
>>> clf = DummyClassifier(strategy='most_frequent', random_state=0)
>>> clf = clf.fit(X, y)
>>> my_custom_loss_func(y, clf.predict(X))
0.69...
>>> score(clf, X, y)
-0.69...

3.3.1.3. Implementing your own scoring object

You can generate even more flexible model scorers by constructing your own scoring object from scratch, without using the make_scorer factory. For a callable to be a scorer, it needs to meet the protocol specified by the following two rules:
• It can be called with parameters (estimator, X, y), where estimator is the model that should be evaluated, X is validation data, and y is the ground truth target for X (in the supervised case) or None (in the unsupervised case).
• It returns a floating point number that quantifies the estimator prediction quality on X, with reference to y. Again, by convention higher numbers are better, so if your scorer returns a loss, that value should be negated.

Note: Using custom scorers in functions where n_jobs > 1
While defining the custom scoring function alongside the calling function should work out of the box with the default joblib backend (loky), importing it from another module is a more robust approach and works independently of the joblib backend.

For example, to use n_jobs greater than 1 in the example below, the custom_scoring_function function is saved in a user-created module (custom_scorer_module.py) and imported:

>>> from custom_scorer_module import custom_scoring_function
>>> cross_val_score(model,
...                 X_train,
...                 y_train,
...                 scoring=make_scorer(custom_scoring_function, greater_is_better=False),
...                 cv=5,
...                 n_jobs=-1)

3.3.1.4. Using multiple metric evaluation

Scikit-learn also permits evaluation of multiple metrics in GridSearchCV, RandomizedSearchCV and cross_validate. There are three ways to specify multiple scoring metrics for the scoring parameter:

• As an iterable of string metrics:

>>> scoring = ['accuracy', 'precision']

• As a dict mapping the scorer name to the scoring function:

>>> from sklearn.metrics import accuracy_score
>>> from sklearn.metrics import make_scorer
>>> scoring = {'accuracy': make_scorer(accuracy_score),
...            'prec': 'precision'}

Note that the dict values can either be scorer functions or one of the predefined metric strings.

• As a callable that returns a dictionary of scores:

>>> from sklearn import datasets
>>> from sklearn.svm import LinearSVC
>>> from sklearn.model_selection import cross_validate
>>> from sklearn.metrics import confusion_matrix
>>> # A sample toy binary classification dataset
>>> X, y = datasets.make_classification(n_classes=2, random_state=0)
>>> svm = LinearSVC(random_state=0)
>>> def confusion_matrix_scorer(clf, X, y):
...     y_pred = clf.predict(X)
...     cm = confusion_matrix(y, y_pred)
...     return {'tn': cm[0, 0], 'fp': cm[0, 1],
...             'fn': cm[1, 0], 'tp': cm[1, 1]}
>>> cv_results = cross_validate(svm, X, y, cv=5,
...                             scoring=confusion_matrix_scorer)
>>> # Getting the test set true positive scores
>>> print(cv_results['test_tp'])
[10  9  8  7  8]
>>> # Getting the test set false negative scores
>>> print(cv_results['test_fn'])
[0 1 2 3 2]
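The two rules of the scorer protocol described above (a callable taking estimator, X and y, and returning a float where higher is better) can be illustrated without any library. The sketch below is plain Python: MajorityClassifier is a hypothetical stand-in for a fitted estimator, and max_abs_error and negated_loss_scorer are names invented for this example. It is not scikit-learn code, only an illustration of the calling convention.

```python
class MajorityClassifier:
    """Hypothetical estimator: always predicts the most common training label."""

    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def max_abs_error(y_true, y_pred):
    """A loss: the largest absolute difference (lower is better)."""
    return max(abs(t - p) for t, p in zip(y_true, y_pred))


def negated_loss_scorer(estimator, X, y):
    """Scorer protocol: called as (estimator, X, y), returns a float.

    The loss is negated so that a higher score means a better model.
    """
    return -float(max_abs_error(y, estimator.predict(X)))


# Toy data (illustrative values only).
X = [[1], [1], [1]]
y = [0, 1, 1]
clf = MajorityClassifier().fit(X, y)
print(negated_loss_scorer(clf, X, y))
```

A callable with this shape is what the scoring parameter expects when a custom scorer is written from scratch instead of being built with make_scorer.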
PRACTICAL 14: IMPLEMENT DATA VISUALIZATION ON DATASET IN FOLLOWING MODULES:
a. Histogram
b. Column Chart
c. Box plot chart
d. Pie chart
e. Scatter plot

1. Histogram:
A histogram represents the frequency of occurrence of values that fall within consecutive, fixed-width intervals. In the code below, histograms are plotted for Age, Income and Sales, so each plot in the output shows how often the values of that attribute fall in each interval.

# import pandas and matplotlib
import pandas as pd
import matplotlib.pyplot as plt

# create the data as a list of rows, one row per employee
data = [['E001', 'M', 34, 123, 'Normal', 350],
        ['E002', 'F', 40, 114, 'Overweight', 450],
        ['E003', 'F', 37, 135, 'Obesity', 169],
        ['E004', 'M', 30, 139, 'Underweight', 189],
        ['E005', 'F', 44, 117, 'Underweight', 183],
        ['E006', 'M', 36, 121, 'Normal', 80],
        ['E007', 'M', 32, 133, 'Obesity', 166],
        ['E008', 'F', 26, 140, 'Normal', 120],
        ['E009', 'M', 32, 133, 'Normal', 75],
        ['E010', 'M', 36, 133, 'Underweight', 40]]

# dataframe created with
# the above data array
df = pd.DataFrame(data, columns=['EMPID', 'Gender', 'Age', 'Sales',
                                 'BMI', 'Income'])

# create histogram for numeric data
df.hist()

# show plot
plt.show()

Output :
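To make the idea of "frequency within fixed intervals" concrete, the counts behind a histogram can be computed by hand. The sketch below uses the Age values from the table above and three equal-width bins; the function name histogram_counts is an assumption for this example, and three bins are chosen only for readability (df.hist() uses more bins by default).

```python
def histogram_counts(values, bins):
    """Split [min, max] into equal-width bins and count the values in each."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1 when all values are equal.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


ages = [34, 40, 37, 30, 44, 36, 32, 26, 32, 36]
edges, counts = histogram_counts(ages, bins=3)

for left, right, count in zip(edges, edges[1:], counts):
    print("{:.0f} to {:.0f}: {}".format(left, right, "*" * count))
```

Each row of stars corresponds to the height of one bar in the plotted histogram for the same bins.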
2. Column Chart:
A column chart is used to show a comparison among different attributes, or it can show a comparison of items over time.

# Dataframe of previous code is used here

# Plot the bar chart for numeric values
# a comparison will be shown between
# all 3: age, income, sales
df.plot.bar()

# plot between 2 attributes
plt.bar(df['Age'], df['Sales'])
plt.xlabel("Age")
plt.ylabel("Sales")
plt.show()

Output :
3. Box plot chart:
A box plot is a graphical representation of statistical data based on the minimum, first quartile, median, third quartile, and maximum. The term "box plot" comes from the fact that the graph looks like a rectangle with lines extending from the top and bottom. Because of the extending lines, this type of graph is sometimes called a box-and-whisker plot.

# For each numeric attribute of dataframe
df.plot.box()

# individual attribute box plot
plt.boxplot(df['Income'])
plt.show()

Output :
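The five values that a box plot is based on can be computed directly. The sketch below uses the Income column from the table in the histogram example and the standard library's statistics.quantiles (Python 3.8 or later) with method="inclusive", which interpolates linearly between data points. It also computes the 1.5 × IQR fences that matplotlib uses by default to decide which points are drawn individually as outliers instead of being covered by the whiskers. The function name five_number_summary is an assumption for this example.

```python
from statistics import quantiles

incomes = [350, 450, 169, 189, 183, 80, 166, 120, 75, 40]


def five_number_summary(values):
    """Return min, first quartile, median, third quartile and max."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": q2,
            "q3": q3, "max": max(values)}


summary = five_number_summary(incomes)
for name, value in summary.items():
    print("{:>6}: {}".format(name, value))

# Points beyond 1.5 * IQR from the box are usually drawn as outliers.
iqr = summary["q3"] - summary["q1"]
lower_fence = summary["q1"] - 1.5 * iqr
upper_fence = summary["q3"] + 1.5 * iqr
outliers = sorted(v for v in incomes if v < lower_fence or v > upper_fence)
print("Outliers:", outliers)
```

So for this column the whiskers would stop short of the true maximum, and the largest incomes would appear as separate points above the box.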
4. Pie Chart:
A pie chart shows how categories make up parts of a whole, that is, the composition of something. A pie chart represents numbers as percentages, and the sum of all segments must equal 100%.

# labels must be an ordered sequence (a list), one label per row
plt.pie(df['Age'], labels = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"],
        autopct ='%1.1f%%', shadow = True)
plt.show()

plt.pie(df['Income'], labels = ["A", "B", "C", "D", "E", "F