%matplotlib inline
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np
SMALL_SIZE = 16
MEDIUM_SIZE = 16
BIGGER_SIZE = 20
plt.rc('font', size=SMALL_SIZE) # controls default text sizes
plt.rc('axes', titlesize=SMALL_SIZE) # fontsize of the axes title
plt.rc('axes', labelsize=MEDIUM_SIZE) # fontsize of the x and y labels
plt.rc('xtick', labelsize=SMALL_SIZE) # fontsize of the tick labels
plt.rc('ytick', labelsize=SMALL_SIZE) # fontsize of the tick labels
plt.rc('legend', fontsize=SMALL_SIZE) # legend fontsize
plt.rc('figure', titlesize=BIGGER_SIZE) # fontsize of the figure title
How I'll solicit feedback from you:
If you have a question in class: #in_class. Other students: feel free to react to others' questions.
Predict what this code will do.
x = 5
print(x > 2)
True
What does x < y do? What does it spit out?
Change the following cell so it prints False
x = 20
print(x < 2)
False
Write code that will only print x if x is more than 100.
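One possible answer to the exercise above, as a minimal sketch. The function name `print_if_large` and the `threshold` parameter are made up for illustration; in class, a bare `if x > 100:` followed by `print(x)` does the same job.

```python
def print_if_large(x, threshold=100):
    """Print x only when it is strictly greater than threshold.

    Returns True if x was printed and False otherwise.
    (Hypothetical helper, not part of the course material.)
    """
    if x > threshold:   # the comparison evaluates to True or False
        print(x)
        return True
    return False


print_if_large(5)      # prints nothing, because 5 > 100 is False
print_if_large(250)    # prints 250, because 250 > 100 is True
```

Note that `x > 100` is strict: a value of exactly 100 is not printed.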
We then discuss this as a class to reveal the key ideas.
This is not induction with no instruction.
https://harmsm.github.io/scientific-computing/cheat-sheet.html
Don't worry, this will become clearer as we go.
Questions?
#help
df = pd.read_excel("class-list.xlsx")
df = df[df.raw_score.notna()]  # keep only rows that have a recorded raw_score
plt.hist(df.raw_score)
plt.xlim(0,20)
plt.xlabel("score")
plt.ylabel("counts")
plt.title("2020, distribution of scores")
plt.plot(df.python,df.score,"o")
fit = np.polyfit(df.python,df.score,1)
plt.plot(np.arange(5),fit[0]*np.arange(5) + fit[1],"-")
plt.title("People are pretty good at self-evaluation")
plt.xlabel("personal evaluation, python skill")
plt.ylabel("score on python quiz")
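# Total sum of squares: spread of the scores about their mean
SStot = np.sum((df.score - np.mean(df.score))**2)
# Residual sum of squares: spread of the scores about the fitted line
SSres = np.sum((df.python*fit[0] + fit[1] - df.score)**2)
# R2 is the fraction of variance explained by the fit: 1 - SSres/SStot
print(f"R2: {1 - SSres/SStot:.3f}")
R2: 0.473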
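For reference, the coefficient of determination of a straight-line fit is conventionally defined as 1 - SSres/SStot, where SSres is the sum of squared residuals about the fitted line and SStot is the sum of squared deviations about the mean. The sketch below works this through on made-up numbers (not the class data); the function name `r_squared` and the arrays `x_demo`, `y_line`, and `y_noisy` are assumptions for illustration only.

```python
import numpy as np


def r_squared(x, y):
    """Return R^2 for a least-squares straight-line fit of y against x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # degree-1 polynomial fit
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)    # scatter about the fitted line
    ss_tot = np.sum((y - np.mean(y)) ** 2)   # scatter about the mean
    return 1 - ss_res / ss_tot


# Made-up example data, not taken from the class list
x_demo = np.array([0, 1, 2, 3, 4])
y_line = np.array([1.0, 3.0, 5.0, 7.0, 9.0])    # lies exactly on a line
y_noisy = np.array([1.0, 2.5, 5.5, 6.5, 9.5])   # scattered about a line

r2_line = r_squared(x_demo, y_line)
r2_noisy = r_squared(x_demo, y_noisy)
print("R2, exact line:", r2_line)
print("R2, noisy data:", r2_noisy)
```

For a straight-line fit with an intercept, this value should match the square of the Pearson correlation coefficient, which gives a quick cross-check with `np.corrcoef`.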