pandas loc vs iloc

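Author: 迹忆客. Last updated: 2024/04/24

This tutorial explains how to select data from a pandas DataFrame using loc and iloc. With iloc, we select elements by the integer positions of rows and columns; with loc, we select elements by row labels and column labels. Both are indexers, so they are used with square brackets rather than called like functions.

To demonstrate selection with loc, we will use the DataFrame defined in the example below.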

import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
        "city": ["new york", "portland", "boston", "seattle", "austin"],
        "grade": ["a", "b-", "b ", "a-", "a"],
    },
    index=roll_no,
)
print(student_df)

Output:

        name  age      city grade
501    alice   17  new york     a
502   steven   20  portland    b-
503  neesham   18    boston    b 
504    chris   21   seattle    a-
505    alice   15    austin     a

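Use `.loc` to select a specific value by index label and column label

We can pass an index label and a column label to .loc, inside square brackets, to extract the value at that row and column.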

import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
        "city": ["new york", "portland", "boston", "seattle", "austin"],
        "grade": ["a", "b-", "b ", "a-", "a"],
    },
    index=roll_no,
)
print("the dataframe of students with marks is:")
print(student_df)
print("")
print("the grade of student with roll no. 504 is:")
value = student_df.loc[504, "grade"]
print(value)

Output:

the dataframe of students with marks is:
        name  age      city grade
501    alice   17  new york     a
502   steven   20  portland    b-
503  neesham   18    boston    b 
504    chris   21   seattle    a-
505    alice   15    austin     a
the grade of student with roll no. 504 is:
a-

This selects the value at index label 504 and column label grade. The first element inside .loc[...] is the index (row) label, and the second is the column label.
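Because .loc works on labels rather than positions, a number that is not in the index is rejected even when it would be a valid position. The following is a minimal sketch of that behaviour, using a trimmed copy of the tutorial's DataFrame; the variable names `grade_504` and `missing_label_raised` are only illustrative.

```python
import pandas as pd

# Trimmed copy of the tutorial's data: only the columns needed here.
student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "grade": ["a", "b-", "b ", "a-", "a"],
    },
    index=[501, 502, 503, 504, 505],
)

# 504 is a label in the index, so this lookup succeeds.
grade_504 = student_df.loc[504, "grade"]

# 0 would be a valid position, but it is not a label in this index,
# so .loc raises KeyError instead of returning the first row.
try:
    student_df.loc[0, "grade"]
    missing_label_raised = False
except KeyError:
    missing_label_raised = True

print(grade_504)
print(missing_label_raised)
```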

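Use `.loc` to select specific columns from a DataFrame

We can also use .loc to select the columns we need from a DataFrame. We pass a list of the required column names as the second element inside .loc[...] to select those columns.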

import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
        "city": ["new york", "portland", "boston", "seattle", "austin"],
        "grade": ["a", "b-", "b ", "a-", "a"],
    },
    index=roll_no,
)
print("the dataframe of students with marks is:")
print(student_df)
print("")
print("the name and age of students in the dataframe are:")
value = student_df.loc[:, ["name", "age"]]
print(value)

Output:

the dataframe of students with marks is:
        name  age      city grade
501    alice   17  new york     a
502   steven   20  portland    b-
503  neesham   18    boston    b 
504    chris   21   seattle    a-
505    alice   15    austin     a
the name and age of students in the dataframe are:
        name  age
501    alice   17
502   steven   20
503  neesham   18
504    chris   21
505    alice   15

The first element inside .loc[...] is :, which stands for all rows of the DataFrame. The second element, ["name", "age"], tells .loc to select only the name and age columns.
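One detail worth checking when selecting columns is the type of the result: a single label gives a Series, while a list of labels gives a DataFrame, even if the list has only one entry. A small sketch, assuming a trimmed copy of the tutorial's DataFrame (the names `name_series` and `name_frame` are illustrative):

```python
import pandas as pd

# Trimmed copy of the tutorial's data.
student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
    },
    index=[501, 502, 503, 504, 505],
)

# A single column label returns a one-dimensional Series.
name_series = student_df.loc[:, "name"]

# A list of labels returns a DataFrame, even with one column.
name_frame = student_df.loc[:, ["name"]]

print(type(name_series).__name__)
print(type(name_frame).__name__)
print(name_frame.shape)
```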

Use `.loc` to filter rows by applying a condition to a column

We can also use .loc to select the rows whose column values satisfy a given condition.

import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
        "city": ["new york", "portland", "boston", "seattle", "austin"],
        "grade": ["a", "b-", "b ", "a-", "a"],
    },
    index=roll_no,
)
print("the dataframe of students with marks is:")
print(student_df)
print("")
print("students with grade a are:")
value = student_df.loc[student_df.grade == "a"]
print(value)

Output:

the dataframe of students with marks is:
        name  age      city grade
501    alice   17  new york     a
502   steven   20  portland    b-
503  neesham   18    boston    b 
504    chris   21   seattle    a-
505    alice   15    austin     a
students with grade a are:
      name  age      city grade
501  alice   17  new york     a
505  alice   15    austin     a

It selects all students in the DataFrame whose grade is a.

Use `iloc` to filter rows by integer position
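The same pattern extends to several conditions at once. The sketch below is an illustration rather than part of the original example: it combines two boolean masks with `&` (each comparison wrapped in parentheses) and also restricts the columns in the same .loc call. The age threshold of 16 is an arbitrary choice for the demonstration.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
        "city": ["new york", "portland", "boston", "seattle", "austin"],
        "grade": ["a", "b-", "b ", "a-", "a"],
    },
    index=[501, 502, 503, 504, 505],
)

# Each comparison yields a boolean Series; & combines them element-wise.
mask = (student_df["grade"] == "a") & (student_df["age"] >= 16)

# First element: the row mask. Second element: the columns to keep.
result = student_df.loc[mask, ["name", "city"]]
print(result)
```

Only the student with roll number 501 satisfies both conditions here, since the other student with grade a is 15.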

import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
        "city": ["new york", "portland", "boston", "seattle", "austin"],
        "grade": ["a", "b-", "b ", "a-", "a"],
    },
    index=roll_no,
)
print("the dataframe of students with marks is:")
print(student_df)
print("")
print("2nd and 3rd rows in the dataframe:")
filtered_rows = student_df.iloc[[1, 2]]
print(filtered_rows)

Output:

the dataframe of students with marks is:
        name  age      city grade
501    alice   17  new york     a
502   steven   20  portland    b-
503  neesham   18    boston    b 
504    chris   21   seattle    a-
505    alice   15    austin     a
2nd and 3rd rows in the dataframe:
        name  age      city grade
502   steven   20  portland    b-
503  neesham   18    boston    b

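It selects the 2nd and 3rd rows of the DataFrame.

We pass the integer positions of the rows to iloc to select them from the DataFrame. Here, the integer positions of the second and third rows are 1 and 2, because positions start at 0.

Filter specific rows and columns from a DataFrame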
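Since iloc follows ordinary Python positional conventions, a single integer and a negative integer also work. The following sketch, on a trimmed copy of the tutorial's DataFrame, shows that a bare integer returns a Series while a list returns a DataFrame; the variable names are illustrative.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
    },
    index=[501, 502, 503, 504, 505],
)

# A bare integer selects one row and returns it as a Series.
second_row = student_df.iloc[1]

# Negative positions count from the end; a list keeps the result a DataFrame.
last_row = student_df.iloc[[-1]]

print(second_row["name"])
print(last_row)
```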

import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
        "city": ["new york", "portland", "boston", "seattle", "austin"],
        "grade": ["a", "b-", "b ", "a-", "a"],
    },
    index=roll_no,
)
print("the dataframe of students with marks is:")
print(student_df)
print("")
print("filtered values from the dataframe:")
filtered_values = student_df.iloc[[1, 2, 3], [0, 3]]
print(filtered_values)

Output:

the dataframe of students with marks is:
        name  age      city grade
501    alice   17  new york     a
502   steven   20  portland    b-
503  neesham   18    boston    b 
504    chris   21   seattle    a-
505    alice   15    austin     a
filtered values from the dataframe:
        name grade
502   steven    b-
503  neesham    b 
504    chris    a-

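It selects the first and last columns, name and grade, for the 2nd, 3rd, and 4th rows of the DataFrame. We pass the list of integer row positions as the first element and the list of integer column positions as the second element inside iloc[...].

Use `iloc` to select ranges of rows and columns from a DataFrame

To select ranges of rows and columns, we can use slices, passing one slice for the rows and one for the columns to iloc.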
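When the rows are known by position but the columns are easier to name, the column positions can be looked up instead of counted by hand. This is one possible approach, not something the original example does: it uses `Index.get_indexer` on `student_df.columns` to translate names into positions before handing them to iloc.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
        "city": ["new york", "portland", "boston", "seattle", "austin"],
        "grade": ["a", "b-", "b ", "a-", "a"],
    },
    index=[501, 502, 503, 504, 505],
)

# Translate column names into their integer positions.
col_positions = student_df.columns.get_indexer(["name", "grade"])

# Rows by position, columns by the looked-up positions.
subset = student_df.iloc[[1, 2, 3], col_positions]

print(list(col_positions))
print(subset)
```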

import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
        "city": ["new york", "portland", "boston", "seattle", "austin"],
        "grade": ["a", "b-", "b ", "a-", "a"],
    },
    index=roll_no,
)
print("the dataframe of students with marks is:")
print(student_df)
print("")
print("filtered values from the dataframe:")
filtered_values = student_df.iloc[1:4, 0:2]
print(filtered_values)

Output:

the dataframe of students with marks is:
        name  age      city grade
501    alice   17  new york     a
502   steven   20  portland    b-
503  neesham   18    boston    b 
504    chris   21   seattle    a-
505    alice   15    austin     a
filtered values from the dataframe:
        name  age
502   steven   20
503  neesham   18
504    chris   21

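It selects the 2nd, 3rd, and 4th rows and the 1st and 2nd columns of the DataFrame. 1:4 covers the rows at positions 1 to 3; the end position 4 is exclusive. Likewise, 0:2 covers the columns at positions 0 and 1.

Comparing pandas `loc` and `iloc`

To select rows and columns with loc, we pass the labels of the rows and columns we want. Likewise, to select values with iloc, we pass the integer positions of those rows and columns.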
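Slices behave differently with loc: a label slice includes both endpoints, whereas an iloc slice excludes the end position. The sketch below compares the two on the tutorial's data and checks that they return the same rows and columns; the names `by_position` and `by_label` are illustrative.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
        "city": ["new york", "portland", "boston", "seattle", "austin"],
        "grade": ["a", "b-", "b ", "a-", "a"],
    },
    index=[501, 502, 503, 504, 505],
)

# Positional slice: positions 1, 2, 3 and columns 0, 1 (end excluded).
by_position = student_df.iloc[1:4, 0:2]

# Label slice: labels 502 through 504 and "name" through "age" (end included).
by_label = student_df.loc[502:504, "name":"age"]

print(by_position.equals(by_label))
```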

import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
        "city": ["new york", "portland", "boston", "seattle", "austin"],
        "grade": ["a", "b-", "b ", "a-", "a"],
    },
    index=roll_no,
)
print("the dataframe of students with marks is:")
print(student_df)
print("")
print("filtered values from the dataframe using loc:")
loc_filtered_values = student_df.loc[[502, 503, 504], ["name", "age"]]
print(loc_filtered_values)
print("")
print("filtered values from the dataframe using iloc:")
iloc_filtered_values = student_df.iloc[[1, 2, 3], [0, 1]]
print(iloc_filtered_values)

Output:

the dataframe of students with marks is:
        name  age      city grade
501    alice   17  new york     a
502   steven   20  portland    b-
503  neesham   18    boston    b 
504    chris   21   seattle    a-
505    alice   15    austin     a
filtered values from the dataframe using loc:
        name  age
502   steven   20
503  neesham   18
504    chris   21
filtered values from the dataframe using iloc:
        name  age
502   steven   20
503  neesham   18
504    chris   21

This shows how loc and iloc can select the same values from a DataFrame: loc uses the row labels 502, 503, 504 and the column labels name and age, while iloc uses the corresponding positions 1, 2, 3 and 0, 1.
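The difference between labels and positions becomes most visible once the row order changes. As an illustration under the assumption that we sort the tutorial's DataFrame by age, position 0 then points at a different student, while a label keeps pointing at the same one. The variable names below are hypothetical.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "name": ["alice", "steven", "neesham", "chris", "alice"],
        "age": [17, 20, 18, 21, 15],
    },
    index=[501, 502, 503, 504, 505],
)

# Sorting reorders the rows but keeps each row's index label.
sorted_df = student_df.sort_values("age")

# Position 0 is now the youngest student (roll no. 505).
first_by_position = sorted_df.iloc[0]

# Label 501 still refers to the same student as before sorting.
row_by_label = sorted_df.loc[501]

print(first_by_position.name, first_by_position["age"])
print(row_by_label.name, row_by_label["age"])
```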
