Class 7: pandas dataframes

April 13, 2020

Session log

Luis Perez
2020-04-13 08:58:45
**Class started**
Luis Perez
2020-04-13 09:07:03
**Entering breakout rooms**
Luis Perez
2020-04-13 09:07:23
Help! We need a TA! (reply in thread if you need a TA)

Cathy Robinson
2020-04-13 09:32:49
Can we get a TA please? (…._Emu)
👍1
Luis Perez
2020-04-13 09:07:33
Share one new thing you learned from the 09_pandas notebook? (Reply in thread.)

Amy Robbins
2020-04-13 09:14:53
You can call a value by row/column titles or index position

Rachel Lutz
2020-04-13 09:15:30
We can make really accessible tables with pandas

Jessica Puente
2020-04-13 09:16:53
`loc` means location

Natanya Villegas
2020-04-13 09:38:20
You can load .csv into pd dataframes

Andrew Holston
2020-04-13 09:38:28
You can save the dataframes to csv and excel files.

Zach Garrison
2020-04-13 09:39:09
There are two ways to access elements within a pandas dataframe. One of them will not work if you try to access a deleted element, whereas the other location command will work as long as the location still exists (even if the data has changed)
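
A minimal sketch of the difference described above, assuming a small hypothetical dataframe (the values and the default integer row labels are made up for illustration, not taken from the notebook). After a row is dropped, `loc` looks for the label and fails if it is gone, while `iloc` returns whatever now sits at that position.

```python
import pandas as pd

# Hypothetical data; row labels are the default integers 0..3
df = pd.DataFrame({"x": [10, 20, 30, 40]})

# Remove the row labeled 2
df = df.drop(2)

# iloc goes by position: position 2 still exists and now holds the row labeled 3
value_by_position = df.iloc[2]["x"]

# loc goes by label: label 2 was deleted, so the lookup raises KeyError
try:
    df.loc[2]
    label_found = True
except KeyError:
    label_found = False

print(value_by_position, label_found)
```
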
Luis Perez
2020-04-13 09:07:52
What should we discuss in the main session from the 09_pandas notebook? (Reply in thread).

Amy Robbins
2020-04-13 09:14:27
This is probably really simple but we need help on: Change the cell below so it prints out all of column "y" (hint: you can slice)
👍1

Luis Perez
2020-04-13 09:15:36
Try checking what [:,:] does

Rachel Lutz
2020-04-13 09:16:08
Same as Amy. I feel like it should be simple but I am having a bit of a hard time indexing

Luis Perez
2020-04-13 09:16:55
Okay, anytime you have `name_of_dataframe.iloc[]` you can have slicing inside of the `[]`, so having `name_of_dataframe.iloc[:,:]` will print all rows and all columns

Roz Carrier
2020-04-13 09:17:14
Help? How do you print just column y (4,5,6) without (a,b,c)?

Rachel Lutz
2020-04-13 09:19:47
@Roz we got it to work with `df.loc[:, "y"]`
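
A short sketch of the slicing discussed in this thread. The dataframe here is an assumed stand-in for the one in the notebook (column "y" holding 4, 5, 6 and rows labeled a, b, c); the "x" column and its values are hypothetical.

```python
import pandas as pd

# Assumed stand-in for the notebook's dataframe
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["a", "b", "c"])

everything = df.iloc[:, :]      # all rows, all columns
y_by_name = df.loc[:, "y"]      # all rows of the column named "y"
y_by_position = df.iloc[:, 1]   # all rows of the column at position 1

# The row labels a, b, c are the index; to_numpy() keeps only the values
y_values = df.loc[:, "y"].to_numpy()
print(y_values)
```

Printing `y_by_name` shows the row labels next to the values because a single column is a Series that carries the index with it; converting to an array is one way to see the values alone.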

Katie Fisher
2020-04-13 09:21:17
can you ever get back the data that you remove from your data frame? For example we removed row 2 but can you ever access that row again?

Laura Desban
2020-04-13 09:24:16
Does every cell have to be the same type of object in a pandas dataframe? For example, can I have one column with single float values, another column with vectors or lists, and another one with strings?

Rachel Lutz
2020-04-13 09:25:30
I don't particularly understand this:

Luis Perez
2020-04-13 09:26:19
`loc` always looks stuff up by name (the row/column label), and `iloc` will always go by index position
👍1

Joseph Harman
2020-04-13 09:37:46
@Laura Desban you can set any part (or all) of a dataframe to any datatype. You can also change the datatype at will using `astype()`. For example, to change every value in column "a" to an integer: `df["a"] = df["a"].astype(int)`. Note that you have to explicitly assign to `df["a"]` to save the change within your dataframe. This is really useful if, for example, your data comes off an instrument as strings or some other weird datatype
👍1
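
A minimal sketch of the `astype()` conversion above, assuming hypothetical instrument readings that arrived as strings (the column names and values are invented for illustration). It also shows that each column carries its own datatype, so a numeric column and a text column can sit side by side.

```python
import pandas as pd

# Hypothetical instrument output: numbers that arrived as strings
df = pd.DataFrame({"a": ["1", "2", "3"], "label": ["low", "mid", "high"]})

# Datatypes are tracked per column
dtypes_before = df.dtypes

# astype returns a new Series; assign it back to keep the change
df["a"] = df["a"].astype(int)

print(df.dtypes)
print(df["a"].sum())
```
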

Natanya Villegas
2020-04-13 09:38:42
Once we load .csv into a pd dataframe, how can it be processed/analyzed downstream?

Joseph Harman
2020-04-13 09:40:28
@Katie Fisher if you delete from a dataframe, the data is gone from that dataframe. However, you can always 1) save dataframes as you go and/or 2) read in your data again (if it's coming from a .csv, for example)
👍1
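
One way to sketch the "save dataframes as you go" idea, assuming a small hypothetical dataframe: keep a copy before deleting, so the removed row is still available from the copy.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "b", "c"])

# Keep an independent copy before removing anything
backup = df.copy()

# drop returns a new dataframe without row "b"; reassigning replaces the old one
df = df.drop("b")

# The row is gone from df but still available in the backup
restored_row = backup.loc["b"]
print(restored_row)
```
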

Cori Cahoon
2020-04-13 09:41:01
For saving a pandas dataframe to a csv is there a way to specify the directory it will save the csv into?

Joseph Harman
2020-04-13 09:44:25
@Cori Cahoon yep! If you want to save "file1" to "folder1", you would say `df.to_csv("folder1/file1")`, assuming "folder1" is in the same directory as the notebook you're working in. I use the package `os` often for file writing - you can do things like `os.mkdir("folder1")` to make folders from your jupyter notebook environment
👍2
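
A hedged sketch of saving into a folder. The data is hypothetical, and a temporary directory stands in for the notebook's working directory so the example leaves nothing behind in your project; in a notebook you could use the relative path directly. It uses `os.makedirs` with `exist_ok=True` rather than `os.mkdir` so that rerunning the cell does not fail when the folder already exists.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# A temporary directory stands in for the notebook's working directory
base_dir = tempfile.mkdtemp()
folder = os.path.join(base_dir, "folder1")

# exist_ok=True means no error if the folder is already there
os.makedirs(folder, exist_ok=True)

# index=False leaves the row labels out of the file
path = os.path.join(folder, "file1.csv")
df.to_csv(path, index=False)

# Read the file back to confirm what was written
reloaded = pd.read_csv(path)
print(reloaded)
```
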

Lila Kaye
2020-04-13 09:46:27
How do the Boolean values have anything to do with what's going on?

Laura Desban
2020-04-13 09:47:24
Can you create a dataframe with only the column names for information and then fill it progressively with your data? (while specifying rows progressively as well?)
👍1

Joseph Harman
2020-04-13 09:54:07
@Laura Desban here's a quick example of that:
import pandas as pd

df = pd.DataFrame(columns=["A", "B"])

# add one row per iteration, labeled by the loop value
for i in range(3):
    df.loc[i] = [1, 2]
df
returns:
   A  B
0  1  2
1  1  2
2  1  2

Laura Desban
2020-04-13 09:58:14
Can you specify the name of the rows too?

Joseph Harman
2020-04-13 10:12:56
yep
df = pd.DataFrame(columns=["A", "B"])

for i in ["x", "y", "z"]:
    df.loc[i] = [1,2]
df
returns:
   A  B
x  1  2
y  1  2
z  1  2
👍1
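
An alternative sketch, not the approach shown in the thread: adding rows one at a time with `loc` can get slow for large tables, so another option is to collect the rows in a plain dictionary first and build the dataframe once at the end. The row names and values below simply mirror the example above.

```python
import pandas as pd

# Collect one dict per row, keyed by the row name
rows = {}
for name in ["x", "y", "z"]:
    rows[name] = {"A": 1, "B": 2}

# orient="index" turns the outer dict keys into row labels
df = pd.DataFrame.from_dict(rows, orient="index")
print(df)
```
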
Polly Poll
2020-04-13 09:08:39
Poll by @Luis Perez
I would rate my understanding of the material in the 09_pandas notebook as:
😱 0
🙁 0
😐 █████ 5
🙂 █████████████ 13
😄 ██████ 6
Cathy Robinson
2020-04-13 09:32:15
Can we get a TA in Emu?
👍1
Luis Perez
2020-04-13 09:38:09
**Leaving breakout rooms**
Luis Perez
2020-04-13 09:54:57
**Class ending**