To retrieve a vector for name, we can access the DataFrame with the dot (.), as we did previously with structs in Section 3:
function names_grades1()
    df = grades_2020()
    df.name
end
JDS.names_grades1()
["Sally", "Bob", "Alice", "Hank"]
or we can index a DataFrame much like an Array, using symbols and special characters. The second index selects the column:
function names_grades2()
    df = grades_2020()
    df[!, :name]
end
JDS.names_grades2()
Note that df.name is exactly the same as df[!, :name], which you can verify yourself.
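One way to check this is to compare the two results with ===, which tests whether both are the very same object. The sketch below assumes a stand-in definition of grades_2020() with the four names used in this section; Alice's grade is not shown here, so its value is an assumption. The comparison with df[:, :name] is added only for contrast.

```julia
using DataFrames

# Stand-in for the grades_2020() used in this section.
# Alice's grade is an assumed value.
function grades_2020()
    name = ["Sally", "Bob", "Alice", "Hank"]
    grade_2020 = [1, 5, 8.5, 4]
    DataFrame(name = name, grade_2020 = grade_2020)
end

df = grades_2020()

# Both forms return the stored column itself, not a copy.
same_object = df.name === df[!, :name]

# By contrast, df[:, :name] returns a copy:
# equal contents, but a different object.
copy_is_same_object = df.name === df[:, :name]
copy_is_equal = df.name == df[:, :name]
```

Here same_object and copy_is_equal are true, while copy_is_same_object is false.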
For any row, say the second row, we can use the first index as row indexing:
df = grades_2020()
df[2, :]
or create a function to give us any row i we want:
function grade_2020(i::Int)
    df = grades_2020()
    df[i, :]
end
JDS.grade_2020(2)
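To make the result of row indexing more concrete, here is a minimal sketch of what we can do with a single row. It assumes the same stand-in grades_2020() as an illustration (Alice's grade is an assumed value); the variable names row_name and row_grade are hypothetical.

```julia
using DataFrames

# Stand-in for the grades_2020() used in this section.
# Alice's grade is an assumed value.
function grades_2020()
    name = ["Sally", "Bob", "Alice", "Hank"]
    grade_2020 = [1, 5, 8.5, 4]
    DataFrame(name = name, grade_2020 = grade_2020)
end

df = grades_2020()

# A single row number combined with `:` returns a DataFrameRow.
row = df[2, :]

# Individual values can be read by column name,
# either with the dot or with a Symbol index.
row_name = row.name
row_grade = row[:grade_2020]
```

For the second row, row_name is "Bob" and row_grade is 5.0.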
We can also get only names for the first 2 rows using slicing (again similar to an Array):
["Sally", "Bob"]
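A slice that gives the two names above could look like the following sketch. The function name first_two_names is hypothetical, and grades_2020() is a stand-in definition in which Alice's grade is an assumed value.

```julia
using DataFrames

# Stand-in for the grades_2020() used in this section.
# Alice's grade is an assumed value.
function grades_2020()
    name = ["Sally", "Bob", "Alice", "Hank"]
    grade_2020 = [1, 5, 8.5, 4]
    DataFrame(name = name, grade_2020 = grade_2020)
end

# Hypothetical helper: a range as the first index selects rows,
# a Symbol as the second index selects one column.
function first_two_names(df)
    df[1:2, :name]
end

two_names = first_two_names(grades_2020())

# Using `:` as the second index keeps all columns and returns a DataFrame.
two_rows = grades_2020()[1:2, :]
```

Selecting a single column with a range of rows gives a plain Vector, whereas keeping all columns gives a smaller DataFrame.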
5.0
which works because zip loops through df.name and df.grade_2020 at the same time like a “zipper”:
df = grades_2020()
collect(zip(df.name, df.grade_2020))
("Sally", 1.0)
("Bob", 5.0)
("Hank", 4.0)
However, converting a DataFrame to a Dict is only useful when the elements are unique. Generally that is not the case, and that’s why we need to learn how to filter a DataFrame.
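The uniqueness requirement can be illustrated with a short sketch. It assumes a stand-in grades_2020() (Alice's grade is an assumed value), and the duplicated names and grades in the second half are made up purely for illustration.

```julia
using DataFrames

# Stand-in for the grades_2020() used in this section.
# Alice's grade is an assumed value.
function grades_2020()
    name = ["Sally", "Bob", "Alice", "Hank"]
    grade_2020 = [1, 5, 8.5, 4]
    DataFrame(name = name, grade_2020 = grade_2020)
end

df = grades_2020()

# With unique names, each name maps to exactly one grade.
dic = Dict(zip(df.name, df.grade_2020))
bob_grade = dic["Bob"]

# Made-up data with a repeated name: the later entry
# overwrites the earlier one, so one grade is silently lost.
dup_names = ["Bob", "Bob"]
dup_grades = [5.0, 7.0]
dup_dic = Dict(zip(dup_names, dup_grades))
n_entries = length(dup_dic)
```

With the repeated name, dup_dic ends up with a single entry holding only the last grade, which is why a Dict lookup is not a general substitute for filtering.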