Programming basics for Biostatistics 6099
Basics of Python programming (part 2)
Zhiguang Huo (Caleb)
Thursday Oct 12th, 2023
Outline
- Control flows
- Loops
- Basic string operators
- file operations
- exceptions
- datetime
- more on python data structure
Control flows
num = input("Enter a number: ")
if int(num) > 0:
    print(f"{num} is positive")
- a colon (:) ends the if line
- indent the chunk of code to be executed (e.g., by 2 or 4 spaces)
- Unlike R, a Python if statement needs no parentheses around the
condition and no curly braces around the code chunk.
Indentation alone determines where the code chunk ends.
Indentation
- Indentation serves another purpose other than code readability
- Python treats the statements with the same indentation level
(statements with an equal number of whitespaces before them) as a single
code block.
- Commonly used indent
- 2 whitespaces
- 4 whitespaces
- 1 tab (not recommended)
- This indentation rule applies to flow control, loops,
functions, etc.
if else elif
num = input("Enter a number: ")
if int(num) > 0:
    print(f"{num} is positive")
else:
    print(f"{num} is not positive")
num = input("Enter a number: ")
if int(num) > 0:
    print("The number is positive")
elif int(num) < 0:
    print("The number is negative")
else:
    print("The number is zero")
if else same line
number = input("Please enter a number: ")
if int(number) % 2 == 0:
    print("even")
else:
    print("odd")
number = input("Please enter a number: ")
print("even") if int(number) % 2 == 0 else print("odd")
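The one-line form is a conditional expression, which produces a value rather than only choosing between two statements. A minimal sketch of using it to pick a value first and print afterwards; the helper name `parity` and the hard-coded number (used instead of input()) are assumptions made for illustration:

```python
# Conditional expression: value_if_true if condition else value_if_false
def parity(number):
    """Return "even" or "odd" for an integer (hypothetical helper)."""
    return "even" if number % 2 == 0 else "odd"

label = parity(7)   # fixed number instead of input(), so the sketch runs unattended
print(label)
```

Choosing the value first and printing once keeps the side effect (print) out of the expression.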
True or False conditions
## False
## False
## True
"apple" in ["apple", "orange"]
## True
## False
## True
## False
True or False conditions
- we could use < (or >) to connect a series of comparisons
a = 4
b = 6
c = 9
a < b and b < c
## True
## True
## True
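A minimal sketch of a chained comparison next to its explicit `and` form, reusing a, b and c from above; the variable names `chained`, `explicit` and `out_of_order` are assumptions for illustration:

```python
a = 4
b = 6
c = 9

chained = a < b < c          # evaluated as (a < b) and (b < c)
explicit = a < b and b < c   # the same test written out
out_of_order = a < c < b     # 4 < 9 holds, but 9 < 6 does not

print(chained, explicit, out_of_order)
```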
match (available for python >= 3.10)
- Selects one case from multiple choices
status = 400
match status:
    case 400:
        print("Bad request")
    case 404:
        print("Not found")
    case 418:
        print("I'm a teapot")
    case _:
        print("Something's wrong with the internet")
## Bad request
for loops
words = ["cat", "dog", "gator"]
for w in words:
    print(w)
## cat
## dog
## gator
words = ["cat", "dog", "gator"]
for w in words:
    print(f"{w} has {len(w)} letters in it.")
## cat has 3 letters in it.
## dog has 3 letters in it.
## gator has 5 letters in it.
range() function
for i in range(3):
    print(i)
## 0
## 1
## 2
- range(n) creates an iterable object
- list(iterable) converts an iterable to a list
- The stop value of range(n) is exclusive: the last number n
is not included.
- range(start, stop) creates the sequence of numbers from start to stop - 1.
- For example, list(range(5)) will produce [0, 1, 2, 3, 4]
## [0, 1, 2]
## [3, 4, 5, 6]
range() function
- range with a step size other than 1.
## [3, 5, 7]
## [7, 5, 3]
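A minimal sketch of the three call forms of range(). The arguments are my own choices, picked so that the results look like the lists printed above; the calls used on the original slides may have been different:

```python
up_to = list(range(3))             # stop only: 0, 1, 2
from_to = list(range(3, 7))        # start and stop: 3, 4, 5, 6
stepped = list(range(3, 9, 2))     # start, stop, step: 3, 5, 7
backward = list(range(7, 1, -2))   # a negative step counts down: 7, 5, 3

print(up_to, from_to, stepped, backward)
```

In every form the stop value itself is excluded.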
words = ["cat", "dog", "gator"]
for i in range(len(words)):
    print(i, words[i])
## 0 cat
## 1 dog
## 2 gator
break
for num in range(1, 10):
    if num % 5 == 0:
        print(f"{num} can be divided by 5")
        break
    print(f"{num} cannot be divided by 5")
## 1 cannot be divided by 5
## 2 cannot be divided by 5
## 3 cannot be divided by 5
## 4 cannot be divided by 5
## 5 can be divided by 5
continue
for num in range(1, 10):
    if num % 5 == 0:
        continue
    print(f"{num} cannot be divided by 5")
## 1 cannot be divided by 5
## 2 cannot be divided by 5
## 3 cannot be divided by 5
## 4 cannot be divided by 5
## 6 cannot be divided by 5
## 7 cannot be divided by 5
## 8 cannot be divided by 5
## 9 cannot be divided by 5
pass
- In Python, pass is the null statement.
- It is a placeholder for functionality to be added
later.
- pass does nothing.
sequence = {'p', 'a', 's', 's'}
for val in sequence:
    pass
a = 33
b = 200
if b > a:
    pass
while loop
num = 1
while num < 10:
    if num % 5 == 0:
        print(f"{num} can be divided by 5")
        break
    print(f"{num} cannot be divided by 5")
    num += 1
## 1 cannot be divided by 5
## 2 cannot be divided by 5
## 3 cannot be divided by 5
## 4 cannot be divided by 5
## 5 can be divided by 5
num = 0
while num < 10:
    num += 1
    if num % 5 == 0:
        continue
    print(f"{num} cannot be divided by 5")
## 1 cannot be divided by 5
## 2 cannot be divided by 5
## 3 cannot be divided by 5
## 4 cannot be divided by 5
## 6 cannot be divided by 5
## 7 cannot be divided by 5
## 8 cannot be divided by 5
## 9 cannot be divided by 5
Basic string operators
- find
- index
- count
- join
- split
- lower
- upper
- title
- replace
- strip
find
title = "I love programming basics for Biostatistics!"
title.find("I")
## 0
## 2
## 3
title.find("o", 4) ## starting searching index is 4
## 9
## -1
- find() and index() are identical except when not found
- find() produces -1
- index() produces an error
title.index("love")
title.index("XX")
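A minimal sketch contrasting the two methods on a missing substring; the variable names `found`, `missing` and `index_result` are assumptions for illustration:

```python
title = "I love programming basics for Biostatistics!"

found = title.find("love")     # position of the first match
missing = title.find("XX")     # find() signals "not found" with -1

try:
    title.index("XX")          # index() raises instead of returning -1
    index_result = "found"
except ValueError as err:
    index_result = f"ValueError: {err}"

print(found, missing, index_result)
```

find() suits a quick check on the return value; index() suits code where a missing substring should stop execution.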
pattern detection
title = "I love programming basics for Biostatistics!"
"love" in title
## True
## False
## False
title.endswith("computing!")
## False
title.startswith("I love")
## True
## 1
join
seq = ["1", "2", "3", "4", "5"]
sep = "+"
sep.join(seq)
## '1+2+3+4+5'
## '12345'
dirs =( "", "usr", "bin", "env")
"/".join(dirs)
## '/usr/bin/env'
sep = "+"
print("C:" + "\\".join(dirs)) ## a backslash is the escape character, so "\\" stands for one literal backslash
## C:\usr\bin\env
split
- the inverse operation of join.
longSeq = "1+2+3+4+5"
longSeq.split("+")
## ['1', '2', '3', '4', '5']
## ['1+2+', '+4+5']
"Using the default value".split()
## ['Using', 'the', 'default', 'value']
lower, upper, title
sentence = "I like programming basics for Biostatistics!"
sentence.lower()
## 'i like programming basics for biostatistics!'
## 'I LIKE PROGRAMMING BASICS FOR BIOSTATISTICS!'
## 'I Like Programming Basics For Biostatistics!'
sentence.islower()
sentence.isupper()
sentence.istitle()
strip
- removes leading (at the beginning) and trailing
(at the end) characters.
- by default, the characters removed are whitespace
- internal whitespace is kept
a = " internal whitespace is kept "
a.strip()
## 'internal whitespace is kept'
b = "*** SPAM * for * everyone!!! ***"
b.strip(" *!")
## 'SPAM * for * everyone'
c = "\na\nb\n\n\nc\n\n"
c.strip()
## 'a\nb\n\n\nc'
replace
a = "This is a cat!"
a.replace("This", "That")
## 'That is a cat!'
## 'Theez eez a cat!'
file operation (read)
https://caleb-huo.github.io/teaching/data/misc/my_file.txt
- open file, display, and close (release the file handle)
file = open("my_file.txt")
contents = file.read()
print(contents)
file.close()
## Hello, my name is Caleb. Hello World!
## I like computing
- Alternative approach without closing step
with open("my_file.txt") as file:
    contents = file.read()
    print(contents)
## Hello, my name is Caleb. Hello World!
## I like computing
file operation (read)
- readlines()
- reads multiple lines,
- save the result in a list
- each element of the list contains a line
myfile = "my_file.txt"
with open(myfile) as file:
    lines = file.readlines()
for aline in lines:
    print(aline.strip())
## Hello, my name is Caleb. Hello World!
## I like computing
file operation (write)
- write to file (overwrite original file)
with open("new_file.txt", mode="w") as file:
    file.write("I like python!")
## 14
- append to file (append at the end of the original file)
with open("new_file.txt", mode="a") as file:
    file.write("We like python!")
## 15
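A minimal sketch of writing, appending and reading back. It assumes a temporary directory (via the standard tempfile module) so that no real file is touched, and it adds a newline to each string, which the calls above do not do:

```python
import os
import tempfile

# Work in a temporary directory so no existing file is overwritten.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "new_file.txt")

    with open(path, mode="w") as file:    # "w" creates the file or empties it
        n_written = file.write("I like python!\n")

    with open(path, mode="a") as file:    # "a" adds to the end
        n_appended = file.write("We like python!\n")

    with open(path) as file:
        lines = file.readlines()

print(n_written, n_appended)
print(lines)
```

write() returns the number of characters written, which is where the numbers printed after each call above come from. Without the trailing "\n", the appended text would land on the same line as the first string.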
Exceptions
with open("a_file.txt") as file:
    file.read()
fruit_list = ["Apple", "Banana", "Pear"]
fruit_list[3]
text = "abc"
print(text + 5)
raise TypeError("This is an error that I made up!")
Handle exceptions
- The errors (exceptions) are handled by except.
- The program will keep executing.
try:
    file = open("a_file.txt")
    print(1 + "2")
except FileNotFoundError:
    print("Catch FileNotFoundError")
except TypeError as error_message:
    print(f"Here is the error: {error_message}.")
else:
    content = file.read()
    print(content)
finally:  ## runs whether or not an exception occurred
    file.close()
    print("File was closed.")
## Catch FileNotFoundError
## File was closed.
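One caveat about the try/finally pattern above: in a fresh session where open() fails, the name file is never assigned, so file.close() in the finally clause can itself raise a NameError. A minimal sketch of an alternative that lets a with block do the closing; the helper name `read_text` and the file name are assumptions for illustration:

```python
def read_text(path):
    """Return the file contents, or None if the file is missing (hypothetical helper)."""
    try:
        with open(path) as file:   # the with block closes the file on exit
            return file.read()
    except FileNotFoundError:
        print(f"Catch FileNotFoundError: {path}")
        return None

# Assumes no file with this name exists in the working directory.
content = read_text("a_file_that_does_not_exist.txt")
print(content)
```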
datetime
- The datetime module supplies classes for manipulating dates and
times.
import datetime as dt
now = dt.date.today() ## date only
now.year
## 2023
## 10
## 12
birthday = dt.date(1995, 7, 31)
age = now - birthday
age.days
## 10300
datetime
now = dt.datetime.now()
now.year
## 2023
## 10
## 12
## 13
## 18
## 59
## 756676
## 3
datetime
now = dt.datetime.now()
print(f'{now:%Y-%m-%d %H:%M}')
## 2023-10-12 13:18
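A minimal sketch of formatting and parsing with the same format codes. It uses a fixed datetime instead of now() so that the result is reproducible; the variable names are assumptions for illustration:

```python
import datetime as dt

moment = dt.datetime(2023, 10, 12, 13, 18, 59)   # fixed value instead of now()

stamp = moment.strftime("%Y-%m-%d %H:%M")        # same codes as in the f-string
parsed = dt.datetime.strptime("2023-10-12 13:18", "%Y-%m-%d %H:%M")
weekday = moment.weekday()                       # Monday is 0, so Thursday is 3
gap = moment - parsed                            # subtracting gives a timedelta

print(stamp, weekday, gap.total_seconds())
```

strftime() turns a datetime into text, and strptime() does the reverse.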
more on python data structure
list: creation and assignment
- create a list from a string
## ['H', 'e', 'l', 'l', 'o']
## [1, 1, 1]
## [1, 0, 3]
list: deletion and slice assignment
names = ["Alice", "Beth", "Carl", "Dan", "Emily"]
names
## ['Alice', 'Beth', 'Carl', 'Dan', 'Emily']
## ['Alice', 'Beth', 'Dan', 'Emily']
names = list("Lucas")
names[3:] = list("ky")
names
## ['L', 'u', 'c', 'k', 'y']
## 'Lucky'
- slice assignment can be unequal length
names = list("Lucas")
names[1:] = list("emonade")
"".join(names)
## 'Lemonade'
- slice assignment can be used as insertion or deletion
numbers = [1, 5]
numbers[1:1] = [2, 3, 4]
numbers
## [1, 2, 3, 4, 5]
numbers = list(range(1,6))
numbers
## [1, 2, 3, 4, 5]
numbers[1:4] = []
numbers
## [1, 5]
list: append and count
alist = [0,1,2]
alist.append(3)
alist
## [0, 1, 2, 3]
asentence = "to be or not to be"
alist = asentence.split()
alist.count("to")
## 2
x = [[1,2], 1, 2, 1, [2, 1, [1,2]]]
x.count(1)
## 2
## 1
list: extend
- extend (recommended for efficiency and readability)
a = [0,1,2]; b = [3,4,5]
a.extend(b)
a
## [0, 1, 2, 3, 4, 5]
a = [0,1,2]; b = [3,4,5]
a + b
## [0, 1, 2, 3, 4, 5]
## [0, 1, 2]
a = [0,1,2]; b = [3,4,5]
a[len(a):] = b
a
## [0, 1, 2, 3, 4, 5]
list: index
- index: returns the position of the first match
asentence = "to be or not to be"
alist = asentence.split()
alist
## ['to', 'be', 'or', 'not', 'to', 'be']
## 0
## 3
## 'not'
alist.index("XX")
list: insert
alist = [1,2,3,5,6]
alist.insert(3, "four")
alist
## [1, 2, 3, 'four', 5, 6]
- insert with slice assignment
alist = [1,2,3,5,6]
alist[3:3] = ["four"]
alist
## [1, 2, 3, 'four', 5, 6]
list: pop
- pop: removes and returns the last element of a list
x = list(range(10))
x.pop()
## 9
## [0, 1, 2, 3, 4, 5, 6, 7, 8]
## 8
## [0, 1, 2, 3, 4, 5, 6, 7]
list: remove
asentence = "to be or not to be"
alist = asentence.split()
alist
## ['to', 'be', 'or', 'not', 'to', 'be']
## ['be', 'or', 'not', 'to', 'be']
alist.remove("XX")
- compare pop and remove
- remove has no return value; it deletes the first occurrence of
the given value
- pop returns a value; by default it removes and returns the last element of a list
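A minimal sketch of the comparison, using the same sentence as above; the variable names `last`, `first` and `result` are assumptions for illustration:

```python
alist = "to be or not to be".split()

last = alist.pop()            # removes and returns the last element
first = alist.pop(0)          # pop also accepts an index
result = alist.remove("be")   # deletes the first "be"; the return value is None

print(last, first, result)
print(alist)
```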
list: reverse and sort
x = ["a", "b", "c"]
x.reverse()
x
## ['c', 'b', 'a']
- sort: the sort method has no return value (it sorts the list in place)
## [3, 4, 5]
y = ["b", "c", "a"]
y.sort()
y
## ['a', 'b', 'c']
x = [5, 3, 4]
y = x.sort()
print(y)
## None
x = [5, 3, 4]
y = sorted(x)
print(y)
## [3, 4, 5]
list: sort
x = [5, 3, 4]
y = x ## x and y are pointing to the same list
y.sort()
print(x)
## [3, 4, 5]
## [3, 4, 5]
x = [5, 3, 4]
y = x[:] ## x[:] is a slice that copies x, so y is a new list
y.sort()
print(x)
## [5, 3, 4]
## [3, 4, 5]
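The difference between the two cases can be checked with the `is` operator, which tests whether two names refer to the same object. A minimal sketch; the names `alias` and `snapshot` are assumptions for illustration:

```python
x = [5, 3, 4]
alias = x          # a second name for the same list
snapshot = x[:]    # a new list holding the same elements

alias.sort()       # sorts the one list that x and alias share

print(x)
print(snapshot)
print(alias is x, snapshot is x)
```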
references and values
list: sort
x = ["aaa", "bb", "cccc"]
x.sort(key = len)
x
## ['bb', 'aaa', 'cccc']
x = [5, 3, 4]
x.sort(reverse = True)
print(x)
## [5, 4, 3]
dictionary: basic operator
phonebook = {"Alice": 2341,
"Beth": 4971,
"Carl": 9401
}
phonebook
## {'Alice': 2341, 'Beth': 4971, 'Carl': 9401}
## 3
## 4971
dictionary: update and delete
phonebook["Alice"] = 1358
phonebook
## {'Alice': 1358, 'Beth': 4971, 'Carl': 9401}
adict = {"Alice": 9572}
phonebook.update(adict)
phonebook
## {'Alice': 9572, 'Beth': 4971, 'Carl': 9401}
del phonebook["Carl"]
"Beth" in phonebook
## True
dictionary: clear
d = {}
d['name'] = "Amy"
d['age'] = 24
d
## {'name': 'Amy', 'age': 24}
## {}
why clear is useful
x = {}
y = x
x['key'] = 'value'
y
## {'key': 'value'}
x = {} ## now x points to a new value {}
y ## y points to the original value {'key': 'value'}
## {'key': 'value'}
x = {}
y = x
x['key'] = 'value'
y
## {'key': 'value'}
x.clear() ## clear the value x points to
y ## y still points to what x points to
## {}
references and values (part 2)
copy
- shallow copy
- only references to the contained objects are copied, so nested objects are shared
d = {}
d['username'] = "admin"
d['machines'] = ["foo", "bar"]
d
## {'username': 'admin', 'machines': ['foo', 'bar']}
c = d.copy()
c['username'] = "Alex" ## c['username'] points to a new value
print(c)
## {'username': 'Alex', 'machines': ['foo', 'bar']}
## {'username': 'admin', 'machines': ['foo', 'bar']}
c['machines'].remove("bar") ## references don't change, the underlying values are changed.
print(c)
## {'username': 'Alex', 'machines': ['foo']}
## {'username': 'admin', 'machines': ['foo']}
references and values (shallow copy)
copy
- deep copy:
- will make a new copy of the values
from copy import deepcopy
d = {}
d['username'] = "admin"
d['machines'] = ["foo", "bar"]
d
## {'username': 'admin', 'machines': ['foo', 'bar']}
c = d.copy()
dc = deepcopy(d)
d['machines'].remove("bar")
print(c)
## {'username': 'admin', 'machines': ['foo']}
## {'username': 'admin', 'machines': ['foo', 'bar']}
references and values (deep copy)
dictionary initialization: fromkeys
- create keys for an empty dictionary.
{}.fromkeys(["name", "age"])
## {'name': None, 'age': None}
- create keys for a dictionary
dict.fromkeys(["name", "age"])
## {'name': None, 'age': None}
dict.fromkeys(["name", "age"], "unknown")
## {'name': 'unknown', 'age': 'unknown'}
dictionary: get
- get method is more flexible
- get is the same as indexing by keys when the key exists
d = {"name": "Amy", "age": 24}
d["name"]
## 'Amy'
## 'Amy'
- get will return None when the key doesn’t exist
d["XX"]
d.get("XX")
d.get("XX", "No exist") ## set your own return value for get
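A minimal sketch of the three lookups side by side, with the failing index access wrapped in try/except so the script keeps running; the variable names are assumptions for illustration:

```python
d = {"name": "Amy", "age": 24}

by_get = d.get("name")               # same result as d["name"]
missing = d.get("XX")                # None when the key is absent
fallback = d.get("XX", "No exist")   # custom default instead of None

try:
    d["XX"]                          # indexing a missing key raises KeyError
    outcome = "found"
except KeyError as err:
    outcome = f"KeyError: {err}"

print(by_get, missing, fallback, outcome)
```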
dictionary: items
- items() returns all (key, value) pairs of the dictionary
phonebook = {"Alice": 2341,
"Beth": 4971,
"Carl": 9401
}
phonebook
## {'Alice': 2341, 'Beth': 4971, 'Carl': 9401}
phonebook.items() ## this is an iterable
## dict_items([('Alice', 2341), ('Beth', 4971), ('Carl', 9401)])
## [('Alice', 2341), ('Beth', 4971), ('Carl', 9401)]
dictionary: loops
- items() can be used to loop over a dictionary
it = phonebook.items()
for key, value in it:
    print(key + "--> " + str(value))
## Alice--> 2341
## Beth--> 4971
## Carl--> 9401
- if you only want the value, not the keys
it = phonebook.items()
for _, value in it:
    print(str(value))
## 2341
## 4971
## 9401
- iterate over the keys of a dictionary directly in a for loop
for key in phonebook:
    print(key + "--> " + str(phonebook[key]))
## Alice--> 2341
## Beth--> 4971
## Carl--> 9401
phonebook.values() ## this is an iterable
## dict_values([2341, 4971, 9401])
## [2341, 4971, 9401]
for i in phonebook.values():
    print(i)
## 2341
## 4971
## 9401
dictionary: pop and popitem
phonebook = {"Alice": 2341,
"Beth": 4971,
"Carl": 9401
}
phonebook.pop("Alice")
## 2341
## {'Beth': 4971, 'Carl': 9401}
- popitem(): removes and returns the last inserted (key, value) pair
phonebook = {"Alice": 2341,
"Beth": 4971,
"Carl": 9401
}
phonebook.popitem()
## ('Carl', 9401)
## {'Alice': 2341, 'Beth': 4971}
tuple: review basics
atuple = (0,1,2)
atuple += (3,4,5)
atuple
## (0, 1, 2, 3, 4, 5)
btuple = (0, 1, 1, ['I', 'like', 'python'])
btuple[3][0] = 'You'
print(btuple)
## (0, 1, 1, ['You', 'like', 'python'])
## 2
print(btuple.index(['You', "like", 'python']))
## 3