Sunday, October 20, 2019

Data Wrangling

Data Wrangling: Group By, Join, Combine, Pivot, Melt & Reshape.


Data Wrangling:
Data wrangling is the process of cleaning data enough that it can be fed into an analytical algorithm. It is also called data munging.

1. Melt: Reshaping data from wide to long format in pandas can be done with melt().
  • Step 1: Create a DataFrame. The data shown below is in wide format.

  • Step 2: Use melt() as shown below. The resulting data will be in long format.
    • id_vars=['countries']: the identifier columns to leave unaltered, i.e. countries in this case.
    • var_name='metrics': the name of the new column that holds the former column names.
    • value_name='values': the name of the new column that holds the corresponding values.
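The melt() step above can be sketched as follows. This is a minimal illustration, not the original example: the country names, the metric columns (population, area) and their numbers are made-up placeholders; only the id_vars, var_name and value_name arguments come from the text.

```python
import pandas as pd

# Hypothetical wide-format data: one row per country, one column per metric.
wide_df = pd.DataFrame({
    "countries": ["A", "B", "C"],
    "population": [100, 200, 300],
    "area": [10, 20, 30],
})

# Keep 'countries' as the identifier; move the former column names into
# a 'metrics' column and their cell values into a 'values' column.
long_df = wide_df.melt(
    id_vars=["countries"],
    var_name="metrics",
    value_name="values",
)

print(long_df)
```

Each country now appears once per metric, so the 3 x 2 wide table becomes 6 long rows.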
2. Pivot: The reverse of melt(), i.e. from long to wide.
  • Step 1: Create a DataFrame. The data shown below is in long format.
  • Step 2: Use pivot() as shown below. The resulting data will be in wide format.
    • index='countries': the column used as the index.
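A minimal sketch of the pivot() step, again with made-up data. The text only mentions the index argument; the columns='metrics' and values='values' arguments used here are assumptions that mirror the column names from the melt section.

```python
import pandas as pd

# Hypothetical long-format data: one row per (country, metric) pair.
long_df = pd.DataFrame({
    "countries": ["A", "A", "B", "B"],
    "metrics": ["population", "area", "population", "area"],
    "values": [100, 10, 200, 20],
})

# 'countries' becomes the index, each distinct metric becomes a column,
# and the cells are filled from the 'values' column.
wide_df = long_df.pivot(index="countries", columns="metrics", values="values")

print(wide_df)
```

Note that pivot() expects each (index, columns) pair to occur only once; with duplicate pairs it raises an error, and pivot_table() with an aggregation function is the usual alternative.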

3. Group By: Similar to group by in SQL.
  • Step 1: Create a DataFrame: 

  • Step 2: Use groupby() as shown below.
    first(): returns the first entry in each of the groups formed.
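Grouping and taking the first entry per group could look like the sketch below. The DataFrame and its column names (team, player, score) are hypothetical stand-ins for the data in the original post.

```python
import pandas as pd

# Hypothetical data: several players per team.
df = pd.DataFrame({
    "team": ["x", "y", "x", "y", "x"],
    "player": ["a", "b", "c", "d", "e"],
    "score": [1, 2, 3, 4, 8],
})

# Split the rows into one group per distinct 'team' value.
grouped = df.groupby("team")

# first() returns the first non-null entry of every column in each group.
firsts = grouped.first()

print(firsts)
```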






  • Step 3: mean() can be used on the grouped data to find the mean of a column within each group.
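A short sketch of a per-group mean, using the same hypothetical team/score data as an assumption. The numeric column is selected explicitly so that text columns are not averaged.

```python
import pandas as pd

# Hypothetical data: several scores per team.
df = pd.DataFrame({
    "team": ["x", "y", "x", "y", "x"],
    "player": ["a", "b", "c", "d", "e"],
    "score": [1, 2, 3, 4, 8],
})

# Mean of 'score' within each team; the result is a Series indexed by team.
mean_scores = df.groupby("team")["score"].mean()

print(mean_scores)
```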
  • Step 4: unstack() can also be used after group by, for example on the grouped means, to move an inner group level from the rows into the columns.
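One way unstack() fits in after a group by is sketched here, assuming the data is grouped by two keys. The columns team, season and score are illustrative names, not taken from the original post.

```python
import pandas as pd

# Hypothetical data: scores per team and season.
df = pd.DataFrame({
    "team": ["x", "x", "x", "y", "y"],
    "season": [2018, 2018, 2019, 2018, 2019],
    "score": [1.0, 3.0, 5.0, 2.0, 4.0],
})

# Grouping by two keys gives a Series with a (team, season) MultiIndex.
means = df.groupby(["team", "season"])["score"].mean()

# unstack() moves the innermost index level (season) into the columns,
# giving one row per team and one column per season.
table = means.unstack()

print(table)
```

unstack() itself does not compute anything; it only reshapes the result that mean() produced.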


