Workshop 3: Learn Data Science: How Python Logic Works: conditional statements, logic statements, loops and functions
In this workshop we introduce conditional statements (if, else and elif), logical operators, loops (while and for) and functions.
These are the foundations for the more complicated and advanced workflows that you often see in Data Science and Software Development applications.
We will take a closer look at how to write conditional statements and loops in Python and how to define functions. These techniques build on the previous workshops on fundamentals and sequences, and they act as the building blocks for any Python application that you will encounter.
Their logic appears in many of the software tools and data science projects that you will interact with as a budding data scientist.
It is useful to pay attention to how these are created and used so that you can recognise them in the wild and understand why they are there.
Python supports the usual logical conditions from mathematics, which include:
- Equals: a == b
- Not equals: a != b
- Less than: a < b
- Less than or equal to: a <= b
- Greater than: a > b
- Greater than or equal to: a >= b
These conditions allow you to check whether something is true or false within the data. For example, we can check whether the value assigned to a variable is equal to something else:
    # assign a value to x
    x = 3
    # print the value
    print(x)
    # check whether x is equal to 4
    print(x == 4)
    # what about whether x is equal to 3
    print(x == 3)
A key distinction to be made here is that the single equals sign is used for assignment to a variable (i.e. storing something in the named variable), while the double equals sign is used to check whether one value is equal to another.
You can try checking the results of different conditions by passing them to print(), which will display True or False:
    # check different conditions
    print(x <= 1)
    print(x >= 1)
    print(x != 0)
    print(x > 5)
    print(x % 2 == 0)
    print("A" in "Alpha")
    print("B" not in "Alpha")
    # Python will return True or False
The value returned from these conditions is called a Boolean. This is a value type that can only be True or False and is used primarily to indicate whether a condition is met. We can check this by passing a condition to type(), which reports the class bool, short for Boolean.
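The type check described above can be sketched as follows. The variable name is_three is only illustrative and does not appear elsewhere in the workshop.

```python
# assign a value to x
x = 3
# store the result of a comparison (is_three is a hypothetical name)
is_three = (x == 3)
# the comparison evaluates to a Boolean value
print(is_three)
# type() reports the class of the value, which is bool
print(type(is_three))
```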
This means that we can use these conditions in if, else and elif statements, which run certain pieces of code only when certain conditions are met:
If, Else and Elif
Python conditionals follow the same logic as ordinary English conditions. For example:
“If I had money I would get a BMW M2, otherwise I would buy a bike”
This could be rewritten as:
“If I had money THEN I would buy a BMW M2, ELSE I would buy a bike”
Python’s IF structure follows this same scheme.
Thus we can create the following example:
    # assign a and b values
    a = 10
    b = 10
    # add a condition
    if a == b:
        # tell Python what to do if the condition is met
        print("a equals b")
We can see here that the condition is met and consequently the print statement is run.
A key point here is the use of indentation (the white space at the beginning of a line) to define what code to run if that condition is met. This is usually four spaces or one tab, and it tells the interpreter which code belongs to the conditional statement. Another key point is the use of the colon (:), which tells the program that this is the end of the condition and that indented code should follow. Without both of these the code would not run.
What happens if the condition is not met?
    # change the values so the condition is no longer met
    a = 10
    b = 20
    # implement the original condition
    if a == b:
        # tell Python what to do if the condition is met
        print("a equals b")
We can see here that because the condition was not met, the code inside the if block did not run.
What if, instead of running nothing, we wanted the code to output the fact that “a does not equal b”? We can do this with an else statement as follows:
    # assign a and b values
    a = 10
    b = 20
    # add a condition
    if a == b:
        # tell Python what to do if the condition is met
        print("a equals b")
    # if a is different from b
    else:
        print("a does not equal b") # this message will be printed
We can see now that, instead of no code running, if the condition is not met then the code following the else statement is run instead.
This simply means that if the first condition is not met, then the second block of code will run.
It is important to note that the colon (:) and indentation are still required to indicate that this is a block of code to be run if the initial condition is not met. If we left out the else statement and the indentation, then the second line would run regardless:
    # assign a and b values
    a = 10
    b = 10
    # add a condition
    if a == b:
        # tell Python what to do if the condition is met
        print("a equals b")
    print("a does not equal b") # this message will be printed
We can see here what happens if the else statement and indentation are not used: we get code running when we don’t want it to.
Another aspect of this is building more information into our conditional. What if, instead of just specifying whether a equals b, we wanted to say whether a is larger than b or a is smaller than b? We can do this by using an elif statement. This acts as another if statement and is equivalent to else if (it is just written as a shortcut, as us programmers are lazy). This then allows us to build more options into our if statement:
    # assign a and b values
    a = 11
    b = 10
    # add a condition
    if a == b:
        # tell Python what to do if the condition is met
        print("a equals b")
    # if a is different from b and smaller
    elif a < b:
        print("a is smaller than b")
    else:
        print("a is not equal to b nor smaller") # this message will be printed
We can see here that we have now built more functionality into our simple conditional statement. This allows the code to handle more complicated situations in which several different outcomes are possible. For example, in a game where there are many different avenues to pursue, you can use a multi-stage conditional to implement different results in response to those different choices.
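As a rough illustration of the game idea, the sketch below maps a player's choice to an outcome. The variable names (choice, outcome) and the messages are invented for this example and are not part of the workshop material.

```python
# the player's choice (a hypothetical input for this sketch)
choice = "right"

# each branch handles one possible choice
if choice == "left":
    outcome = "You enter the forest."
elif choice == "right":
    outcome = "You reach the river."
elif choice == "back":
    outcome = "You return to the village."
else:
    # anything else falls through to the default
    outcome = "You stay where you are."

print(outcome)
```

Only the first branch whose condition is met runs, so the order of the elif statements matters when conditions can overlap.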
Logical operators – not / and / or
We can also make these more complicated by building in the logical operators and, or and not. In terms of conditionals:
- and returns True if both statements are true;
- or returns True if at least one of the statements is true;
- not reverses the result: it returns False if the statement is true and True if it is false.
Using these logical operators allows a single conditional to check multiple conditions. We can look at the following examples and change the assigned values for a, b and c to see the different outcomes:
    # assign values to a, b and c
    a = 1
    b = 2
    c = 3
    # check different conditions
    if a > b and b > c:
        print("Both conditions are True.")
    if a > b or a > c:
        print("At least one of the conditions is True.")
    if not(a % 2 == 0):
        # If a is NOT even, it will be printed.
        print(a)
    if a > b and a > c and b > c:
        print("All conditions are true.")
    if a % 2 == 0 or b % 2 == 0 and c > 0:
        # Be careful which command will be evaluated first.
        print("a or b is even and c is positive.")
    if not(a > 100) and b < 3:
        print("Both conditions are true.")
    if not(a + b < 10) or b < 0:
        print("At least one of the conditions is true.")
    # if not(a + b > 10 and c > 0): Will this condition print anything?
    #     print("")
From this it is important to note the precedence of the logical operators: not is evaluated first, then and, then or.
Parentheses can be used to override this order, which is important to bear in mind when combining conditions.
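The effect of precedence can be seen in a small sketch. The values and the variable names (default_grouping, forced_grouping, not_first, not_last) are chosen only to make the difference visible.

```python
a = 2
b = 3
c = -1

# and binds more tightly than or, so this is read as:
# (a % 2 == 0) or ((b % 2 == 0) and (c > 0))
default_grouping = a % 2 == 0 or b % 2 == 0 and c > 0
# parentheses force the or to be evaluated first
forced_grouping = (a % 2 == 0 or b % 2 == 0) and c > 0

# not binds more tightly than and, so this is read as:
# (not a > 5) and (c > 0)
not_first = not a > 5 and c > 0
# parentheses apply not to the whole combined condition
not_last = not (a > 5 and c > 0)

print(default_grouping, forced_grouping)
print(not_first, not_last)
```

When in doubt, adding parentheses makes the intended grouping explicit for both Python and the reader.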
The nested if
It is also possible to have multiple if statements within one single if statement. This can sometimes be used instead of elif and else statements, and it lets you check further conditions once the original condition has been passed:
    # assign values to a and b
    a = 51
    b = 5
    if b % 2 == 1:
        print("b is odd")
        if a > 10:
            print("a is greater than 10")
            if a > 50:
                print("a is greater than 50")
Short hand statements
Finally, you can also shorten these statements to only one line if each instance only has one statement to execute. For example:
    # create the two variables
    a = 2
    b = 3
    # for a simple if statement
    if a > b: print("A is greater than B")
    # for an if, else statement
    print("A") if a > b else print("B")
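The one-line if, else form is an expression, so it can also produce a value to store rather than only choosing which print to run. A minimal sketch, where the name larger is purely illustrative:

```python
# create the two variables
a = 2
b = 3
# the expression evaluates to "A" or "B", which is then assigned
larger = "A" if a > b else "B"
print(larger)
```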
Knowing the values for a, b and c, check to see if they can represent the side lengths of a triangle. If so, mention which kind it is: equilateral, isosceles, right-angled or scalene.
    # assign values to our 'sides'
    a = 7
    b = 6
    c = 2
    # First, to see if they could be sides, all must be positive.
    # The condition for them to represent triangle sides is that
    # no side is as long as or longer than the sum of the other two.
    # This is checked inside the else of the first <if>, so that
    # you only deal with positive values.
    if a <= 0 or b <= 0 or c <= 0:
        print("They can't represent lengths.")
    else:
        if a >= b + c or b >= a + c or c >= a + b:
            print("It is not a triangle.")
        else:
            # Now we know we can have a triangle, we are just trying to see what kind.
            if a == b == c:
                print("It is an equilateral triangle.")
            elif a == b or a == c or b == c:
                print("It is an isosceles triangle.")
            elif a**2 == b**2 + c**2 or b**2 == c**2 + a**2 or c**2 == a**2 + b**2:
                print("It is a right-angled triangle.")
            else:
                print("It is a scalene triangle.")
Loops are an important structure to learn for automating multiple instances of the same task. For example, if you wanted to perform the same operation on each item in a list you could use a loop to iterate over the list and perform that task, or if you wanted to keep producing an output while a certain condition is met you could also use a loop.
For this, it is important to note that there are two different types of loops:
- while loops
- for loops
Both of these have the same format as the if statement in that they both will have a colon after the condition line and the block of code to loop over will be indented.
A while loop will repeat the instructions within the indented block of code as long as the condition that created the loop remains true:
    # initialise the variable
    x = 0
    # write the condition
    while x <= 10:
        # the following lines will be executed as long as x is smaller than or equal to 10
        print(x)
        # however, if there are no changes made to our variable, the loop will continue infinitely
        # so, in this case, we will increase x by one
        x = x + 1
As I mentioned, if there are no changes made to the variable, the loop will never end. This is what we would call an Infinite Loop.
    # DO NOT TRY RUNNING THIS CODE
    # x = 0
    # while True:
    #     print(x)
To exit an infinite loop you can use the break command.
    # initialise x
    x = 0
    # start the loop
    while True:
        print(x)
        # when reaching the break command, the loop stops
        break
This command can also stop a loop that isn’t infinite if an alternative condition is met.
    # initialise x
    x = 0
    # start the loop
    while x < 100:
        print(x)
        # add a condition
        if x == 5:
            # if the condition is met, stop the loop
            break
        # if the condition isn't met, x will increase and the loop will continue
        x += 1
To skip over a certain iteration of the while loop you can also use the continue statement:
    # initialise x
    x = 0
    # start the loop
    while x < 10:
        # add one to the value
        x += 1
        # continue condition
        if x == 4:
            # input the continue statement
            continue
        # the output of the loop if the continue statement is not run
        print(x)
Now, let’s do a simple exercise. We will calculate the sum of all numbers between 1 and 10. After running the code, try changing the code so you will only find the sum of the odd numbers smaller than 10.
    # initialise the variables
    x = 1
    s = 0
    # start the loop
    while x <= 10:
        # add x to our sum
        s = s + x
        # increase the variable
        x += 1
    print(s)
A for loop, in contrast to the while loop, doesn’t require you to increment a variable as it will do it itself. So, for a variable x in a certain list, dictionary or string, the variable will take every item in turn, in order. Look at the following examples:
    # define a list
    colours = ["red", "blue", "yellow", "green", "orange"]
    # loop over the list
    for x in colours:
        # print each value in the list
        print(x)
You can also loop over strings:
    # create the for condition
    for x in "red":
        # print the result
        print(x)
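Lists and strings are shown above, so here is a minimal sketch of the dictionary case. The dictionary stock and its contents are made up for illustration.

```python
# a small dictionary (keys and values invented for this sketch)
stock = {"red": 3, "blue": 5, "yellow": 2}

# looping over a dictionary directly gives its keys
for colour in stock:
    print(colour)

# .items() gives each key together with its value
total = 0
for colour, count in stock.items():
    print(colour, count)
    total = total + count

print(total)
```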
As with the while loop, we can also take advantage of both the break and continue statements:
    # create the for condition
    for x in colours:
        # if the colour is blue
        if x == "blue":
            # then skip
            continue
        # else if the colour is orange
        elif x == "orange":
            # break the loop
            break
        # print the colour
        print(x)
The range() function
A very useful way of looping over numbers is by using the range() function. There are 2 ways this can be done:
- for i in range(n), loop starts at 0 and ends at n-1
- for i in range(x,n), the loop starts at x and ends at n-1
! Remember that the range’s upper bound is exclusive.
NOTE: Python starts counting at 0!
    for i in range(10):
        print(i)

    for i in range(5,10):
        print(i)
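To see the bounds at a glance, the values produced by range() can be collected into a list. The sketch below also uses a third argument, the step, which is not covered in this workshop and is shown only as an extra.

```python
# range(n) starts at 0 and stops before n
first = list(range(5))
# range(x, n) starts at x and stops before n
second = list(range(5, 10))
# an optional third argument sets the step between values
odds = list(range(1, 10, 2))

print(first)   # [0, 1, 2, 3, 4]
print(second)  # [5, 6, 7, 8, 9]
print(odds)    # [1, 3, 5, 7, 9]
```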
Below, we will calculate the sum of the first 10 numbers (as we did using a while loop before). We can see that this time there’s no need to increment i.
    # initialise the sum variable
    s = 0
    # begin the loop
    for i in range(1,11):
        # add the number to our sum
        s = s + i
    # print the result
    print(s)
Let’s adjust the code to calculate the sum of odd numbers only.
    # initialise the sum variable
    s = 0
    # begin the loop
    for i in range(1,11):
        # add the number to our sum only if it is odd
        if i % 2 == 1:
            s = s + i
    print(s)
As with conditionals, it is also possible to nest loops within each other.
    # defining two lists
    colours = ["red", "blue", "yellow", "green", "orange"]
    clothes = ["shirt", "socks"]
    for x in colours: # first, we'll go over the first list
        for y in clothes: # now, we'll go over the second one
            print(x, y)
Try creating an output that will display:
    *
    **
    ***
    # start the first loop
    for i in range(4):
        # We will go up to 4 as i represents the number of the line containing '*'
        # start the second loop
        # This loop will control the columns
        for j in range(i):
            print('*', end = '')
        print()
Try modifying the code above to get different shapes:
- 3 additional lines below the existing ones, containing 4,5 and respectively 6 * on the line.
- 3 * on the first line, 2 on the second and one on the third.
Functions are extremely useful when you have code that requires you to calculate something over and over again. You can create algorithms (for calculating sums, averages, standard deviations and more) and call upon them when you need them. This is a way of reducing the amount of code that you actually write.
An important acronym in coding is DRY: Don’t Repeat Yourself.
This means you shouldn’t be writing the same code over and over again.
A function achieves this by creating a block of code that runs only when the function is called. You can pass data, known as parameters, to the function and it can return data as a result.
Defining a function
To define a function you will need to write def function_name(input variables):
This follows the same notation as we have been using for both conditional statement and loops as above in that you need the colon and indentation.
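A minimal sketch of this syntax is shown below. The function rectangle_area and its parameters are hypothetical names chosen for illustration.

```python
# def, then the function name, then the input variables in brackets, then a colon
def rectangle_area(width, height):
    # the indented block is the body of the function
    area = width * height
    # return hands the result back to whoever called the function
    return area

# nothing runs until the function is called
print(rectangle_area(3, 4))
```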
For example, using the code we wrote for our triangle problem, we can create a function that will check any a, b, c to see what kind of triangle it is (if it can be one).
For this, it is important to name functions appropriately. Here we will call it triangle_function, as this ensures anyone using your code knows what it is and what it is meant to do. In Python, it is convention to name functions in lowercase, with an underscore (_) separating words so that they are easy to read:
    def triangle_function(a, b, c):
        if a <= 0 or b <= 0 or c <= 0:
            print("They can't represent lengths.")
        else:
            if a >= b + c or b >= a + c or c >= a + b:
                print("It is not a triangle.")
            else:
                # Now we know we can have a triangle, we are just trying to see what kind.
                if a == b == c:
                    print("It is an equilateral triangle.")
                elif a == b or a == c or b == c:
                    print("It is an isosceles triangle.")
                elif a**2 == b**2 + c**2 or b**2 == c**2 + a**2 or c**2 == a**2 + b**2:
                    print("It is a right-angled triangle.")
                else:
                    print("It is a scalene triangle.")
Calling a function
Now, running the code above produced no output. But it created the function, which we can now call for any set of values we want. Try changing the numbers in the parentheses to see if the function works.
This simplifies the code: instead of writing the same code again and again, we can just call the function.
It is important to note here that information has been passed into the function as arguments (input variables). Arguments are specified after the function name, inside the brackets, and you can add as many arguments as you want as long as you separate them with commas.
In this example, we used three arguments (or parameters, whatever you want to call them!). If we try to feed the function more or fewer arguments than specified in the function definition we will get an error:
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    ~AppDataLocalTemp/ipykernel_3564/1009925093.py in <module>
    ----> 1 my_function(3,4)

    TypeError: my_function() missing 1 required positional argument: 'c'

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    ~AppDataLocalTemp/ipykernel_3564/2131875176.py in <module>
    ----> 1 my_function(3,4,5,6)

    TypeError: my_function() takes 3 positional arguments but 4 were given
The error will tell you how many arguments are missing or how many extra were given so it is important to pay attention.
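The two errors can be reproduced safely with a small stand-in function. In this sketch add_three is a hypothetical function, and try/except (not covered in this workshop) is used only so that the error messages can be captured and printed instead of stopping the program.

```python
# a stand-in function that needs exactly three arguments
def add_three(a, b, c):
    return a + b + c

messages = []

# too few arguments
try:
    add_three(3, 4)
except TypeError as error:
    messages.append(str(error))

# too many arguments
try:
    add_three(3, 4, 5, 6)
except TypeError as error:
    messages.append(str(error))

# show what Python reported in each case
for message in messages:
    print(message)
```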
We can get around this by specifying default values, which ensure that the function will run even if we do not input all the arguments, or indeed if we put none in at all:
    def triangle_function(a=3, b=4, c=5):
        if a <= 0 or b <= 0 or c <= 0:
            print("They can't represent lengths.")
        else:
            if a >= b + c or b >= a + c or c >= a + b:
                print("It is not a triangle.")
            else:
                # Now we know we can have a triangle, we are just trying to see what kind.
                if a == b == c:
                    print("It is an equilateral triangle.")
                elif a == b or a == c or b == c:
                    print("It is an isosceles triangle.")
                elif a**2 == b**2 + c**2 or b**2 == c**2 + a**2 or c**2 == a**2 + b**2:
                    print("It is a right-angled triangle.")
                else:
                    print("It is a scalene triangle.")
We can see here that the function hasn’t broken like it did previously.
We can also input variables in the incorrect order if we specify the parameter when we call the function:
triangle_function(c = 5, b = 4, a =3)
This is useful when a function has many parameters but you are happy with the defaults for most of them and only want to specify a few. Note, however, that you cannot put a positional argument (one passed without a name) after a keyword argument (one passed with a name):
triangle_function(b = 3, 4)
      File "C:UsersphiliAppDataLocalTemp/ipykernel_3564/1484231010.py", line 1
        my_function(b = 3, 4)
                            ^
    SyntaxError: positional argument follows keyword argument
triangle_function(3, b = 4, c = 5)
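The behaviour of default values and named arguments can be checked with a simpler function that returns what it received. Here describe_sides is a hypothetical helper written for this sketch, not part of the triangle code.

```python
# every parameter has a default value
def describe_sides(a=3, b=4, c=5):
    return "a=" + str(a) + ", b=" + str(b) + ", c=" + str(c)

# no arguments: all defaults are used
print(describe_sides())
# one positional argument: it fills a, the rest keep their defaults
print(describe_sides(6))
# one named argument: only c is overridden
print(describe_sides(c=10))
# named arguments can be given in any order
print(describe_sides(c=5, b=4, a=3))
```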
When creating a function, it is also important to write a docstring to tell users what the function actually does. It is aimed at users of the code who can’t necessarily see your comments:
    def triangle_function(a=3, b=4, c=5):
        """Used to determine what type of triangle a triangle with sides of length a, b and c is

        Inputs:
        a, b, c: length of sides, default: 3, 4, 5
        """
        if a <= 0 or b <= 0 or c <= 0:
            print("They can't represent lengths.")
        else:
            if a >= b + c or b >= a + c or c >= a + b:
                print("It is not a triangle.")
            else:
                # Now we know we can have a triangle, we are just trying to see what kind.
                if a == b == c:
                    print("It is an equilateral triangle.")
                elif a == b or a == c or b == c:
                    print("It is an isosceles triangle.")
                elif a**2 == b**2 + c**2 or b**2 == c**2 + a**2 or c**2 == a**2 + b**2:
                    print("It is a right-angled triangle.")
                else:
                    print("It is a scalene triangle.")
One-line docstrings are for really simple functions, while multi-line docstrings specify the inputs and outputs of more complicated functions.
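A docstring is stored with the function, so it can be read back without opening the source code. A minimal sketch, using a hypothetical function square:

```python
def square(x):
    """Return x multiplied by itself."""
    return x * x

# the docstring is available through the __doc__ attribute
print(square.__doc__)
```

In an interactive session, calling help(square) displays the same text alongside the function signature.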
While our existing function mostly prints out results already, we can use functions to be able to store and return values. This means that once the calculation has been performed by the function, instead of printing, we can use the function to store the result in another variable:
    # create the function
    def add_five(x):
        """adds five to the value of x"""
        return x + 5

    # call the function
    b = add_five(5)
    # print the result
    print(b)
This is useful if we want to pass the output of one function as input into another function, forming a chain of functions based on what needs to be performed on the underlying data.
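Chaining might look like the sketch below, where the returned value of one function is passed straight into another. The function double is a hypothetical second step added for this example.

```python
def add_five(x):
    """adds five to the value of x"""
    return x + 5

def double(x):
    """multiplies the value of x by two"""
    return x * 2

# the output of add_five becomes the input of double
result = double(add_five(5))
print(result)
```

This only works because add_five returns its result. A function that only prints would hand back None to the next step.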
The return statement also stops the function from running any further, so if you want a function to stop when a condition is met then you can use return.
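As a sketch of return stopping a function early, the hypothetical function below leaves its loop as soon as it finds what it is looking for.

```python
def first_multiple_of_seven(numbers):
    """Return the first multiple of 7 in numbers, or None if there is none."""
    for n in numbers:
        if n % 7 == 0:
            # return ends the function immediately, so the loop stops here
            return n
    # this line is reached only if no multiple of 7 was found
    return None

print(first_multiple_of_seven([3, 14, 21]))
print(first_multiple_of_seven([1, 2]))
```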
Global and Local Variables
When dealing with functions, any variable that exists only in the function is called a Local Variable. This kind of variable can be used only in the function.
Global Variables are the ones that exist outside a function. This kind of variable can be used in the function too.
It is important to be able to understand the distinction between these two and how they can or cannot be used:
Trying local variables outside the function.
    # define a function
    def function():
        # y is a local variable (it only exists within our function), we will give y the value 100
        y = 100
        # this function returns the value of y
        return y
    # calling the function
    function()
    # now, we will try to do an operation with y outside the function
    print(y) # because y is a local variable in the function, it won't work
    # the error tells us that y is not defined
    ---------------------------------------------------------------------------
    NameError                                 Traceback (most recent call last)
    ~AppDataLocalTemp/ipykernel_3564/2261376186.py in <module>
         10
         11 # now, we will try to do an operation with x outside the function
    ---> 12 print (y) # because x is a local variable in the function, it won't work

    NameError: name 'y' is not defined
Using a global variable in the function.
    # assigning x a value as a global variable (outside the function)
    x = "global variable"
    # defining a function
    def function_2():
        # we can use x because it's global in the function
        print("x is a " + x)
    # calling the function
    function_2()
Trying to change a global variable in the function.
    c = 200
    # define a function
    def add_100(c):
        # alter the local variable
        return c + 100
    # call the function
    print(add_100(c))
    # call the original variable
    print(c)
    # c has not changed because only the local variable c has changed inside the function
The distinction between global and local variables is an important one.
While global variables can be used in the function, it is bad practice, as global variables can be changed outside of the function, which will affect the outcome. It is better practice to pass everything the function needs as parameters and to define any other variables inside the function.
Local variables defined inside the function cannot be called outside the function so it is important to clearly define both local and global variables.
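The risk of relying on a global variable can be shown with a short sketch. The names discount, price_using_global and price_using_parameter are invented for this example.

```python
# a global variable
discount = 10

def price_using_global(price):
    # relies on whatever value the global discount has when the function is called
    return price - discount

def price_using_parameter(price, discount):
    # everything the function needs is passed in as a parameter
    return price - discount

before = price_using_global(100)
# the global is changed somewhere else in the program
discount = 50
after = price_using_global(100)
# the parameter version depends only on what it is given
explicit = price_using_parameter(100, 10)

print(before, after, explicit)
```

The same call to price_using_global gives two different answers, while price_using_parameter always gives the same result for the same inputs.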
- Write an algorithm that calculates the product of the first 15 even numbers.
- Create a function that checks if a given number is a multiple of 4.
- Create a function that checks if a number is prime.