Friday, 29 August 2025

Python Pandas - 3

A. Math Function
B. Date Function
C. Text Function
D. Aggregate Function
E. Group by, Having, Order by

A. Math Function
-----------------------
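1. power() - raise each value to a power
2. round() - round each value to the nearest whole number
3. mod() - remainder after division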
------------
Ex1: Using power() function
---
import pandas as pd
import numpy as np

s1 ={'a':10,'b':5,'c':3}
Data ={'Score':s1}

df= pd.DataFrame(Data) 
df['Score']=np.power((df['Score']),3)
print(df)

output:
--------
     Score
a   1000
b    125
c     27
===================================  
Ex2: Using round() function
---
import pandas as pd
import numpy as np

s1 ={'a':1.5,'b':2.6,'c':3.8}
Data ={'Score':s1}

df= pd.DataFrame(Data) 
df['Score']=np.round((df['Score']))
print(df)

output:
--------
  Score
a    2.0
b    3.0
c    4.0
===================================  
Ex3: Using mod() function
---
import pandas as pd
import numpy as np

s1 ={'a':10,'b':25,'c':30}
Data ={'Score':s1}

df= pd.DataFrame(Data) 
df['Score']=np.mod((df['Score']),3)
print(df)

output:
--------
    Score
a      1
b      1
c      0
===================================
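The three NumPy calls above also have pandas-native equivalents. The following is a minimal sketch, assuming the same Score values as Ex1 and Ex2; the variable names are chosen only for illustration.

```python
import pandas as pd

# Same values as Ex1 and Ex2 above, reused only for illustration
scores = pd.Series({'a': 10, 'b': 5, 'c': 3})
decimals = pd.Series({'a': 1.5, 'b': 2.6, 'c': 3.8})

cubed = scores ** 3          # same result as np.power(scores, 3)
remainder = scores % 3       # same result as np.mod(scores, 3)
rounded = decimals.round()   # same result as np.round(decimals)

print(cubed.tolist())      # [1000, 125, 27]
print(remainder.tolist())  # [1, 2, 0]
print(rounded.tolist())    # [2.0, 3.0, 4.0]
```

Note that NumPy and pandas round exact halves to the nearest even number, so 2.5 rounds to 2.0 while 1.5 rounds to 2.0.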

B. Date Function:
---------------------
1. now() - display the current date and time
2. date_range() - generate a range of dates
3. day_name() - get the day names of the dates
4. month_name() - get the month names of the dates
5. month - get the month numbers of the dates
6. day - get the day of the month of the dates
7. year - get the years of the dates
------------------------------------------------------------------
Ex1 : Display current Date
-----
import pandas as pd
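# now() is a class method, so it is called on pd.Timestamp directly
date = pd.Timestamp.now()
print(date)

Output (the value depends on when the code runs):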
---------
2020-08-18 17:22:17.661221
=====================================
Ex2:   Display Date Range
-----
import pandas as pd
date = pd.date_range(start='2020-01-01', freq='D', periods=5)
print(date)

output:
--------
DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
               '2020-01-05'],
              dtype='datetime64[ns]', freq='D')
===================================
Ex3: using day_name() function
-----
import pandas as pd
date = pd.date_range(start='2020-08-18', freq='D', periods=5)
print(date)
print(date.day_name())

output:
---------
DatetimeIndex(['2020-08-18', '2020-08-19', '2020-08-20', '2020-08-21',
               '2020-08-22'],dtype='datetime64[ns]', freq='D')
Index(['Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'], dtype='object')
====================================
Ex4: using month_name() function
-----
import pandas as pd
date = pd.date_range(start='2020-08-18', freq='M', periods=5)
print(date)
print(date.month_name())

output:
---------
DatetimeIndex(['2020-08-31', '2020-09-30', '2020-10-31', '2020-11-30',
               '2020-12-31'],dtype='datetime64[ns]', freq='M')
Index(['August', 'September', 'October', 'November', 'December'], dtype='object')
====================================
Ex5: using month function
-----
import pandas as pd
date = pd.date_range(start='2020-08-18', freq='M', periods=5)
print(date)
print(date.month)

output:
---------
DatetimeIndex(['2020-08-31', '2020-09-30', '2020-10-31', '2020-11-30',
               '2020-12-31'], dtype='datetime64[ns]', freq='M')
Int64Index([8, 9, 10, 11, 12], dtype='int64')
====================================
Ex6: using day function
-----
import pandas as pd
date = pd.date_range(start='2020-08-18', freq='M', periods=5)
print(date)
print(date.day)

output:
---------
DatetimeIndex(['2020-08-31', '2020-09-30', '2020-10-31', '2020-11-30',
               '2020-12-31'],dtype='datetime64[ns]', freq='M')
Int64Index([31, 30, 31, 30, 31], dtype='int64')
====================================
Ex7: using year function
-----
import pandas as pd
date = pd.date_range(start='2020-08-18', freq='Y', periods=5)
print(date)
print(date.year)

output:
---------
DatetimeIndex(['2020-12-31', '2021-12-31', '2022-12-31', '2023-12-31',
               '2024-12-31'],dtype='datetime64[ns]', freq='A-DEC')
Int64Index([2020, 2021, 2022, 2023, 2024], dtype='int64')
====================================
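The examples above call the date functions on a DatetimeIndex. When the dates sit in a DataFrame column, the same functions are reached through the .dt accessor. This is a minimal sketch; the OrderDate column and its values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical dates, stored as text and then converted to datetime
df = pd.DataFrame({'OrderDate': ['2020-08-18', '2020-09-30', '2020-12-31']})
df['OrderDate'] = pd.to_datetime(df['OrderDate'])

# The .dt accessor exposes the date functions on a column
df['DayName'] = df['OrderDate'].dt.day_name()
df['MonthName'] = df['OrderDate'].dt.month_name()
df['Year'] = df['OrderDate'].dt.year
df['Month'] = df['OrderDate'].dt.month
df['Day'] = df['OrderDate'].dt.day

print(df)
```

On pandas 2.2 and later, the 'M' and 'Y' frequency aliases used in Ex4 to Ex7 raise a deprecation warning; 'ME' and 'YE' are the replacement aliases.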

C. Text Function:
--------------------
1. upper() - uppercase
2. lower() - lowercase
3. len()     - find length of the string
4. lstrip()  - remove left space(LTRIM())
5. rstrip()  - remove right space(RTRIM())
6. strip()  - remove both left and right space(TRIM())
7. slice()  - substring(start to end) - MID(),SUBSTR(),SUBSTRING()
8. left()    - extract the first n characters of a string (LEFT()); pandas has no left() method, so use str[:n]
9. right()  - extract the last n characters of a string (RIGHT()); pandas has no right() method, so use str[-n:]
------------------------------
Ex1: using UPPER function
-----
import pandas as pd
s1 ={0:'jahab',1:'sudhir',2:'rino'}
Data ={'Name':s1}
df= pd.DataFrame(Data) 
df['Name']=df['Name'].str.upper()
print(df)

output:
--------
      Name
0   JAHAB
1   SUDHIR
2   RINO
================================
Ex2: using LOWER function
-----
import pandas as pd
s1 ={0:'JAHAB',1:'SUDHIR',2:'RINO'}
Data ={'Name':s1}
df= pd.DataFrame(Data) 
df['Name']=df['Name'].str.lower()
print(df)

output:
--------
      Name
0    jahab
1    sudhir
2    rino
================================
Ex3: using len() function
-----
import pandas as pd
s1 ={0:'jahab',1:'sudhir',2:'rino'}
Data ={'Name':s1}
df= pd.DataFrame(Data) 
df['Name']=df['Name'].str.len()
print(df)

output:
--------
      Name
0    5
1    6
2    4
================================
Ex4: using lstrip() function - Remove left space
-----
import pandas as pd
s1 ={0:'  jahab',1:'sudhir  ',2:'   rino'}
Data ={'Name':s1}
df= pd.DataFrame(Data) 
df['Name']=df['Name'].str.lstrip()
print(df)

output:
--------
          Name
0        jahab
1    sudhir
2        rino
================================
Ex5: using rstrip() function - Remove right space
-----
import pandas as pd
s1 ={0:'  jahab',1:'sudhir  ',2:'   rino'}
Data ={'Name':s1}
df= pd.DataFrame(Data) 
df['Name']=df['Name'].str.rstrip()
print(df)

output:
--------
          Name
0        jahab
1       sudhir
2          rino
================================
Ex6: using strip() function - Remove both right and left space
-----
import pandas as pd
s1 ={0:'  jahab',1:'sudhir  ',2:'   rino'}
Data ={'Name':s1}
df= pd.DataFrame(Data) 
df['Name']=df['Name'].str.strip()
print(df)

output:
--------
          Name
0        jahab
1        sudhir
2        rino
================================
Ex7: using slice() function - substring of the given string
-----
import pandas as pd
s1 ={0:'jahab',1:'sudhir',2:'rino'}
Data ={'Name':s1}
df= pd.DataFrame(Data) 
df['Name']=df['Name'].str.slice(1,3)
print(df)

output:
--------
          Name
0        ah
1        ud
2        in
================================
Ex8: using left() function - first 3 characters of the string
-----
import pandas as pd

s1 ={0:'jahab',1:'sudhir',2:'rino'}
Data ={'Name':s1}

df= pd.DataFrame(Data) 
df['Name']=df['Name'].str[:3]
print(df)

output:
--------
          Name
0        jah
1        sud
2        rin
================================
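Ex9: using right() function - last 3 characters of the string
-----
import pandas as pd
s1 ={0:'jahab',1:'sudhir',2:'rino'}
Data ={'Name':s1}
df= pd.DataFrame(Data) 
# a negative start counts from the end of the string: the last 3 characters
df['Name']=df['Name'].str[-3:]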
print(df)

output:
--------
          Name
0        hab
1        hir
2        ino
================================
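The text functions can be chained, because each .str call returns a new Series. The sketch below assumes the same names as Ex4 to Ex6; left() and right() are hypothetical helper functions written for this example, not pandas methods.

```python
import pandas as pd

# Same padded names as Ex4 to Ex6
names = pd.Series(['  jahab', 'sudhir  ', '   rino'])

# Chain: trim both sides, then convert to uppercase
cleaned = names.str.strip().str.upper()

def left(series, n):
    """Hypothetical helper: first n characters, like SQL LEFT()."""
    return series.str[:n]

def right(series, n):
    """Hypothetical helper: last n characters, like SQL RIGHT()."""
    return series.str[-n:]

print(cleaned.tolist())            # ['JAHAB', 'SUDHIR', 'RINO']
print(left(cleaned, 3).tolist())   # ['JAH', 'SUD', 'RIN']
print(right(cleaned, 3).tolist())  # ['HAB', 'HIR', 'INO']
print(cleaned.str.len().tolist())  # [5, 6, 4]
```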
D. Aggregate Function
-----------------------------
1. sum() - sum of the values
2. max() - max of the values
3. min() - min of the values
4. count() - count the non-null values
5. count(*) - count all rows, including nulls (in pandas: len(df))
------------------------------
Ex1: using sum() function
-----
import pandas as pd

s1 ={'a':10,'b':5,'c':3}
Data ={'Score':s1}
df= pd.DataFrame(Data) 
print(df.aggregate(['sum']))

Output:
---------
     Score
sum  18
=========================
Ex2: using max() function
-----
import pandas as pd

s1 ={'a':10,'b':5,'c':3}
Data ={'Score':s1}
df= pd.DataFrame(Data) 
print(df.aggregate(['max']))

Output:
---------
     Score
max  10
=========================
Ex3: using min() function
-----
import pandas as pd

s1 ={'a':10,'b':5,'c':3}
Data ={'Score':s1}
df= pd.DataFrame(Data) 
print(df.aggregate(['min']))

Output:
---------
     Score
min  3
=========================
Ex4: using count() function
-----
import pandas as pd

s1 ={'a':10,'b':5,'c':3}
Data ={'Score':s1}
df= pd.DataFrame(Data) 
print(df.aggregate(['count']))

Output:
---------
     Score
count  3
===============================
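aggregate() also accepts several function names at once, and the difference between count() and count(*) only shows up when the data has missing values. A minimal sketch, assuming the Score values from the examples above plus one missing value added for illustration:

```python
import numpy as np
import pandas as pd

# Scores from the examples above, with one missing value added
df = pd.DataFrame({'Score': [10, 5, np.nan, 3]})

# Several aggregates in one call; missing values are skipped
summary = df.aggregate(['sum', 'max', 'min', 'count'])
print(summary)

non_null = df['Score'].count()  # like COUNT(Score): ignores the missing value
all_rows = len(df)              # like COUNT(*): counts every row

print(non_null, all_rows)       # 3 4
```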
E. Group by Function:
-------------------------
   * groupby() splits the data into groups based on some criteria (for example, the values in one column), so that a function can be applied to each group
   step1: Splitting the object into groups
   step2: Applying a function to each group
   step3: Combining the results

Ex1:
    import pandas as pd
    s1={0:'Riders', 1:'Riders', 2:'Devils', 3:'Devils', 4:'Kings'}
    s2={0:1, 1:2,  2:2, 3:3, 4:3}
    s3={0:2015, 1:2016, 2:2017, 3:2018, 4:2019}
    s4={0:876, 1:789, 2:863, 3:673, 4:741}
    
    Data={'Team':s1, 'Place':s2, 'Year':s3, 'Points':s4}
    df = pd.DataFrame(Data)
    print(df)
    print(df.groupby(['Team']).groups)

Output:
----------
     Team  Place  Year  Points
0  Riders      1  2015     876
1  Riders      2  2016     789
2  Devils      2  2017     863
3  Devils      3  2018     673
4   Kings      3  2019     741
{'Devils': [2, 3], 'Kings': [4], 'Riders': [0, 1]}
======================================
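Ex1 only shows the splitting step. The sketch below walks through all three steps (split, apply, combine) by hand and then compares the result with the usual one-line form. It assumes the Team and Points columns from Ex1; summing the points is an arbitrary choice of function for illustration.

```python
import pandas as pd

# Team and Points columns from Ex1
df = pd.DataFrame({
    'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings'],
    'Points': [876, 789, 863, 673, 741],
})

# step1 + step2: iterate over the groups and apply sum() to each one
pieces = {team: group['Points'].sum() for team, group in df.groupby('Team')}

# step3: combine the per-group results into one Series
manual = pd.Series(pieces, name='Points')

# The same three steps in a single call
combined = df.groupby('Team')['Points'].sum()

print(manual.to_dict())    # {'Devils': 1536, 'Kings': 741, 'Riders': 1665}
print(combined.to_dict())  # {'Devils': 1536, 'Kings': 741, 'Riders': 1665}
```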
Ex2:
import pandas as pd
s1={0:'Riders',1:'Riders',2:'Devils',3:'Devils',4:'Kings'}
s2={0:1,1:2,2:2,3:3,4:3}
s3={0:2015,1:2016,2:2017,3:2018,4:2019}
s4={0:876,1:789,2:863,3:673,4:741}
Data={'Team':s1, 'Place':s2, 'Year':s3, 'Points':s4}
df = pd.DataFrame(Data)
print(df)
print(df.groupby('Team').groups)
# keep only the teams that have at least 3 rows (like SQL HAVING); no team qualifies here
print(df.groupby('Team').filter(lambda x: len(x) >= 3))

output:
---------
     Team  Place  Year  Points
0  Riders      1  2015     876
1  Riders      2  2016     789
2  Devils      2  2017     863
3  Devils      3  2018     673
4   Kings      3  2019     741
{'Devils': [2, 3], 'Kings': [4], 'Riders': [0, 1]}
Empty DataFrame
Columns: [Team, Place, Year, Points]
Index: []
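
The filter() call in Ex2 plays the role of SQL HAVING on the group size. The following is a minimal sketch of HAVING on an aggregate value followed by ORDER BY, assuming the Team and Points columns from Ex2; the thresholds (2 rows, 1000 points) are arbitrary values chosen for illustration.

```python
import pandas as pd

# Team and Points columns from Ex2
df = pd.DataFrame({
    'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings'],
    'Points': [876, 789, 863, 673, 741],
})

# HAVING COUNT(*) >= 2: keep the rows of teams with at least 2 rows
at_least_two = df.groupby('Team').filter(lambda g: len(g) >= 2)

# GROUP BY Team, then HAVING SUM(Points) > 1000
totals = df.groupby('Team')['Points'].sum()
having = totals[totals > 1000]

# ORDER BY SUM(Points) DESC
ordered = having.sort_values(ascending=False)

print(at_least_two)
print(ordered.to_dict())  # {'Riders': 1665, 'Devils': 1536}
```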
