4 DataFrames.jl - 4.2 Index and Summarize - 《Julia Data Science》

To retrieve the name column as a vector, we can access the DataFrame with the dot syntax (.), as we did previously with structs in Section 3:

function names_grades1()
    df = grades_2020()
    df.name
end
JDS.names_grades1()

["Sally", "Bob", "Alice", "Hank"]

or we can index a DataFrame much like an Array, using symbols and special characters such as ! and :. The first index selects rows and the second index selects columns:

function names_grades2()
    df = grades_2020()
    df[!, :name]
end
JDS.names_grades2()

Note that df.name is exactly the same as df[!, :name]: both return the column stored in the DataFrame itself rather than a copy, whereas df[:, :name] returns a copy of the column.
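One possible way to check this equivalence is sketched below. It is an illustration rather than the authors' own check: grades_2020() is redefined here as a stand-in, and Alice's grade is an assumed placeholder value. The === operator tests whether two values are the very same object.

```julia
using DataFrames

# Stand-in for the book's grades_2020(); Alice's grade (8.5) is an assumed placeholder.
grades_2020() = DataFrame(name=["Sally", "Bob", "Alice", "Hank"],
                          grade_2020=[1.0, 5.0, 8.5, 4.0])

df = grades_2020()

# Both forms return the same vector object held by the DataFrame (no copy is made).
same_object = df.name === df[!, :name]

# Using : as the row index returns a new vector with equal contents instead.
is_copy = df[:, :name] !== df.name
equal_values = df[:, :name] == df.name

println(same_object, " ", is_copy, " ", equal_values)
```

Because df.name and df[!, :name] hand back the stored column, mutating the result also changes the DataFrame; df[:, :name] is the safer choice when an independent copy is wanted.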

To get a single row, say the second one, we use the first index, which selects rows:

df = grades_2020()
df[2, :]

or create a function to give us any row i we want:

function grade_2020(i::Int)
    df = grades_2020()
    df[i, :]
end
JDS.grade_2020(2)
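Indexing a single row with df[i, :] returns a DataFrameRow, whose values can be read by column name. The sketch below illustrates this under the assumption of a stand-in grades_2020() (Alice's grade is an assumed placeholder); it is not code from the book.

```julia
using DataFrames

# Stand-in for the book's grades_2020(); Alice's grade (8.5) is an assumed placeholder.
grades_2020() = DataFrame(name=["Sally", "Bob", "Alice", "Hank"],
                          grade_2020=[1.0, 5.0, 8.5, 4.0])

df = grades_2020()

# A single row index with : for the columns gives a DataFrameRow.
row = df[2, :]
is_row = row isa DataFrameRow

# Values in the row can be read with dot syntax or with a Symbol index.
row_name = row.name
row_grade = row[:grade_2020]

# Giving both a row and a column index returns the single cell directly.
cell = df[2, :grade_2020]
```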

We can also get only names for the first 2 rows using slicing (again similar to an Array):

["Sally", "Bob"]
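A slice that yields this result could look like the following sketch. The stand-in grades_2020() and the variable names are assumptions made for illustration, and Alice's grade is a placeholder value.

```julia
using DataFrames

# Stand-in for the book's grades_2020(); Alice's grade (8.5) is an assumed placeholder.
grades_2020() = DataFrame(name=["Sally", "Bob", "Alice", "Hank"],
                          grade_2020=[1.0, 5.0, 8.5, 4.0])

df = grades_2020()

# A range as the row index and a single Symbol as the column index
# returns a plain Vector holding the selected entries.
first_two = df[1:2, :name]

# Row indices can also be given as an explicit vector of positions.
picked = df[[1, 4], :name]

# Wrapping the column in a vector keeps the result a DataFrame.
sub = df[1:2, [:name]]
```

Choosing :name versus [:name] therefore decides whether the result is a vector or a one-column DataFrame.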

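If we assume that all names in the table are unique, we can also look up a person's grade by name by converting the two columns to a Dict, for example with Dict(zip(df.name, df.grade_2020)). Looking up "Bob" in that Dict gives:

5.0

This works because zip loops through df.name and df.grade_2020 at the same time like a “zipper”: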

df = grades_2020()
collect(zip(df.name, df.grade_2020))

("Sally", 1.0)

("Bob", 5.0)

("Hank", 4.0)

However, converting a DataFrame to a Dict is only useful when the keys (here, the names) are unique. Generally that is not the case, and that’s why we need to learn how to filter a DataFrame.
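To see why uniqueness matters, consider the sketch below. Since df.name and df.grade_2020 are ordinary vectors, plain vectors stand in for them here; the variable names, Alice's grade, and the second "Bob" entry are hypothetical values chosen for illustration.

```julia
# Plain vectors standing in for df.name and df.grade_2020.
student_names = ["Sally", "Bob", "Alice", "Hank"]
grades = [1.0, 5.0, 8.5, 4.0]   # Alice's 8.5 is an assumed placeholder

# zip pairs the i-th name with the i-th grade; Dict turns the pairs into a lookup table.
lookup = Dict(zip(student_names, grades))
bob_grade = lookup["Bob"]

# Hypothetical data with a second "Bob": the later pair overwrites the earlier one,
# so one of the two grades is silently lost.
dup_names = [student_names; "Bob"]
dup_grades = [grades; 9.0]
dup_lookup = Dict(zip(dup_names, dup_grades))

println(bob_grade, " ", length(dup_lookup), " ", dup_lookup["Bob"])
```

With five input rows the second Dict still holds only four entries, which is the information loss that filtering the DataFrame avoids.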