Data Wrangling: Group By, Join, Combine, Pivot, Melt & Reshape
Data Wrangling:
Data wrangling is the process of cleaning and reshaping data until it is fit to be fed into an analytical algorithm. It is also called data munging.
1. Melt: Reshaping data from wide to long format in pandas can be done with melt().
- Step 1: Create a DataFrame in wide format.
- Step 2: Call melt() on the DataFrame. The result is in long format.
- id_vars=['countries']: identifier columns that are left unaltered, i.e. countries in this case.
- var_name='metrics': name of the new column that holds the former column names.
- value_name='values': name of the new column that holds the corresponding values.
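The melt steps above can be sketched as follows. This is a minimal example, not the original post's data: the country names, the metric columns `population_in_million` and `gdp_percapita`, and all numbers are made up for illustration, and the identifier column is assumed to be spelled `countries`.

```python
import pandas as pd

# Hypothetical wide-format data: one row per country, one column per metric.
wide_df = pd.DataFrame({
    "countries": ["A", "B", "C"],
    "population_in_million": [100, 200, 120],
    "gdp_percapita": [2000, 7000, 15000],
})

# Keep 'countries' as the identifier; every other column is stacked into
# a ('metrics', 'values') pair, giving one row per country and metric.
long_df = wide_df.melt(id_vars=["countries"], var_name="metrics", value_name="values")

print(long_df)
```

With three countries and two metric columns, the long result has six rows. Note that the column has to be read as `long_df["values"]`, because `long_df.values` is the DataFrame's underlying array attribute.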
2. Pivot: Reverse of melt(), i.e. from long to wide format.
- Step 2: Call pivot() on the long-format DataFrame. The result is in wide format.
- index='countries': column used as the index of the result.
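A minimal sketch of the pivot step, under the same assumptions as before: the data is invented, and the `columns='metrics'` and `values='values'` arguments are assumed here since the text only names `index`.

```python
import pandas as pd

# Hypothetical long-format data: one row per (country, metric) pair.
long_df = pd.DataFrame({
    "countries": ["A", "A", "B", "B"],
    "metrics": ["population_in_million", "gdp_percapita"] * 2,
    "values": [100, 2000, 200, 7000],
})

# Spread the entries of 'metrics' back out into separate columns,
# with one row per country.
wide_df = long_df.pivot(index="countries", columns="metrics", values="values")

print(wide_df)
```

pivot() raises an error if an (index, columns) pair occurs more than once; in that case pivot_table() with an aggregation function is the usual alternative.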
3. Group By: Similar to group by in SQL.
- Step 1: Create a DataFrame:
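- Step 2: Use groupby() to form the groups. first() returns the first entry of each group.
- Step 3: mean() can be used to find the mean of each numeric column within each group.
- Step 4: unstack() can also be applied after a group by on several keys, for example to lay out the group means as a table with the inner group key moved into the columns.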
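The group by steps above might look like this in practice. The DataFrame, its column names (`team`, `year`, `score`) and its values are hypothetical and only serve to illustrate first(), mean() and unstack().

```python
import pandas as pd

# Hypothetical data: several scores per team and year.
df = pd.DataFrame({
    "team": ["x", "x", "y", "y", "x", "y"],
    "year": [2020, 2021, 2020, 2021, 2020, 2021],
    "score": [10, 20, 30, 40, 50, 60],
})

# first(): the first row of each group, indexed by the group key.
first_rows = df.groupby("team").first()

# mean(): the mean score within each team.
mean_scores = df.groupby("team")["score"].mean()

# Grouping by two keys gives a Series with a two-level index;
# unstack() moves the inner level ('year') into the columns.
mean_table = df.groupby(["team", "year"])["score"].mean().unstack()

print(first_rows)
print(mean_scores)
print(mean_table)
```

Here unstack() does not compute anything itself; it only reshapes the result of mean() into a team-by-year table.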