Aggregating and grouping rows in SQL is performed using GROUP BY. In pandas, .groupby() is applied to achieve the same results.
Grouping rows by a single column in pandas for aggregating data
In SQL:
SELECT
    column_1,
    COUNT(column_2),
    AVG(column_2)
FROM
    table
GROUP BY
    column_1;
In pandas:
# 'count' skips NaN, like COUNT(column_2); 'size' counts every row, like COUNT(*)
(
    table
    .groupby('column_1')
    .agg({'column_2': ['count', 'mean']})
)
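A minimal runnable sketch of the single-column case may help here. The DataFrame name `table` and its sample values are made up for illustration; only the column names follow the placeholders above. It computes both 'count' and 'size' to show how they differ when column_2 has missing values: 'count' skips NaN, as COUNT(column_2) skips NULL in SQL, while 'size' counts every row, as COUNT(*) does.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption, not from the original text).
table = pd.DataFrame({
    "column_1": ["a", "a", "b", "b", "b"],
    "column_2": [1.0, 3.0, 2.0, np.nan, 4.0],
})

# The parentheses let the method chain span several lines.
result = (
    table
    .groupby("column_1")
    .agg({"column_2": ["count", "size", "mean"]})
)

# The result has one row per group and two-level column labels,
# e.g. ("column_2", "count").
print(result)
```

For group "b" the two counts differ because one of its three rows has a missing column_2 value. The mean ignores that row as well, which matches how AVG treats NULL.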
Grouping rows by multiple columns in pandas for aggregating data
In SQL:
SELECT
    column_1,
    column_3,
    COUNT(column_2),
    AVG(column_2)
FROM
    table
GROUP BY
    column_1,
    column_3;
In pandas:
# 'count' skips NaN, like COUNT(column_2); 'size' counts every row, like COUNT(*)
(
    table
    .groupby(['column_1', 'column_3'])
    .agg({'column_2': ['count', 'mean']})
)
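The multi-column case can be sketched the same way. This version is one possible variation rather than the only approach: it assumes a made-up DataFrame named `table`, passes as_index=False so the grouping keys come back as ordinary columns (closer to the shape of a SQL result set), and uses named aggregation to get flat column names. The output names `count_2` and `avg_2` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data (an assumption, not from the original text).
table = pd.DataFrame({
    "column_1": ["a", "a", "a", "b", "b"],
    "column_3": ["x", "x", "y", "x", "x"],
    "column_2": [1.0, 3.0, 5.0, 2.0, 4.0],
})

result = (
    table
    # as_index=False keeps column_1 and column_3 as regular columns.
    .groupby(["column_1", "column_3"], as_index=False)
    # Named aggregation: output_name=(input_column, function).
    .agg(count_2=("column_2", "count"), avg_2=("column_2", "mean"))
)

# One row per (column_1, column_3) combination present in the data.
print(result)
```

Only combinations that actually occur in the data produce a row, as with GROUP BY in SQL, so the pair ("b", "y") does not appear in the result.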