CSV ASSIGNMENT
Write the source code in the notebook.
1. Suppose a CSV file named Employee.csv has the following content:
Empid, Name, Age, City, Salary
100, Ritesh, 25, Mumbai, 15000
101, Aakash, 26, Goa, 16000
102, Mahima, 27, Hyderabad, 20000
103, Lakshay, 23, Delhi, 18000
104, Manu, 25, Mumbai, 25000
105, Nidhi, 26, Delhi
106, Geetu, 30, Bangalore, 28000
Example 1:
# Reading the Emplyee.csv data in a dataframe
import pandas as pd
df = pd.read_csv(“E:\\Data\\Employee.csv”)
print(df)
One thing to remember is that the missing values from the CSV fill shall be treated
as NaN(Not a Number) in pandas dataframe.
We can see the total number of rows (records) and columns (fields) present in the
table with the help of shape command.
>>>df.shape
(8, 5)
2. Reading CSV file with specific columns
We can read specific columns from a CSV file using usecols attribute of read_csv()
function.
Example 2:
import pandas as pd
df = pd.read_csv(“E:\\Data\\Employee.csv”, usecols = [‘Name’, ‘Age’,‘Salary’])
print(df)
3. Reading CSV file with specific rows
You can display selective record/rows using nrows option or attribute used with
read_csv() method.
Example 3:
import pandas as pd
df = pd.read_csv(“E:\\Data\\Employee.csv”, nrows = 5)
print(df)
4. Reading CSV file without header
If you do not want to display the first row as the header for dataframe using
Employee table, then this can be done by specifying None argument for header
option or skiprows option using read_csv() method.
Example 4:
import pandas as pd
df = pd.read_csv(“E:\\Data\\Employee.csv”, nrows = 5, header = None)
print(df)
5. Reading CSV file without index
You can also read and load the record into a dataframe without displaying their
respective index number by specifying the attribute index_col = 0 using the
read_csv() method.
Example 5:
import pandas as pd
df = pd.read_csv(“E:\\Data\\Employee.csv”, nrows = 5, index_col = 0)
print(df)
6. Reading CSV file with new column names
You can read the CSV file into a dataframe with new column names using name
attribute.
Example 6:
import pandas as pd
df = pd.read_csv(“E:\\Data\\Employee.csv”, nrows = 5, skiprows = 1,
name = [‘E_id’, ‘Ename’, ‘E_age’, ‘Ecity’, ‘Esalary’])
print(df)
7. Writing a CSV file with default index
To create a CSV file from a dataframe, the to_csv() method is used.
Example 7:
# Copying the content of Employee.csv to a new CSV file
import pandas as pd
df = pd.read_csv(“E:\\Data\\Employee.csv”)
df.to_csv(“E:\\Data\\Empnew.csv”)
Upon executing the above commands, Empnew.scv file shall be created containing the same content as Employee.csv with default index values.
Example 8:
# Copying the content of Employee.csv to a new CSV file
import pandas as pd
student = {'RollNo' : [1,2,3,4,5,6],
'StudName' : ['Teena' ,'Rinku', 'Payal', 'Akshay', 'Garvit','Yogesh'],
'Marks' : [90, 78, 89, 77, 97,98],
'Class' : ['11A', '11B', '11C', '11D', '11E', '11F']}
df = pd.DataFrame(student, columns = ['RollNo', 'StudName', 'Marks','Class'])
df.to_csv("D:\\Student.csv")
9. Copying Fields into a New File
In certain situations, it is required to create a duplicate file containing only the
selected fields. For this purpose we use columns attribute of to_csv() function.
Example 9:
# Creating a duplicate csv file with selective columns
import pandas as pd
df = pd.read_csv(“E:\\Data\\Employee.csv”)
df.to_csv(“E:\\Data\\Emp.csv”, columns = [‘Empid’, ‘Name’])
The above code will create a new CSV file Emp.csv with only Empid and Name columns.
***********************************************************
#**********************Saving
DataFrame as CSV *******************************
dfE=pd.DataFrame({'Empno':[100,101,102,103,104,105,106,107,108,109,110,111,112],
'Name':['Sunita Sharma','Ashok Singhal',
'Sumit
Avasti','Jyoti Lamba','Martin S.','Binod Goel',
'Chetan
Gupta','Sudhir Rawat','Kavita Sharma',
'Tushar
Tiwari','Anand Rathi','Sumit Vats','Manoj Kaushik'],
'Department':['RESEARCH','SALES','SALES',
'RESEARCH','SALES','SALES','ACCOUNTS','RESEARCH',
'ACCOUNTS','SALES','OPERATIONS','RESEARCH','OPERATIONS'],
'Salary':[45600,43900,27000,45900,32500,45200,36800,
37000,42900,49500,41600,47800,43600],
'Commission':[5600,3900,7000,4900,3500,4200,6800,7000,
4900,4500,8200,np.nan,np.nan],
'Job':['CLERK','SALESMAN','SALESMAN','MANAGER',
'SALESMAN','MANAGER','MANAGER','ANALYST','CLERK',
'MANAGER','SR_MANAGER','SR_MANAGER','CLERK']})
print(dfE)
print("*"*50)
dfE.to_csv("D:\GRADE
XII PYTHON/mihir1.csv")
CSV ASSIGNMENT_2
1.# Importing and exporting data between
pandas and CSV file.
# To create and open a data frame using
‘Student_result.csv’ file using Pandas.
# To display row labels, column labels data
types of each column and the dimensions
# To display the shape (number of rows and
columns) of the CSV file.
Sol:
import pandas as pd
#import csv
#Reading the Data
df =
pd.read_csv("student_result.csv")
# Display Name of Columns
print(df.columns)
# Display no of rows and column
print(df.shape)
# Display Column Names and their types
print(df.info())
2. Read the ‘Student_result.csv’ to create a data frame and do the
following operation:
# To display Adm_No, Gender and Percentage from ‘student_result.csv’ file.
# To display the first 5 and last 5 records from ‘student_result.csv’ file.
Sol:
import pandas as pd
#import csv
#To display Adm_No, Gender and Percentage from
‘student_result.csv’ file.
df = pd.read_csv("student_result.csv",usecols
= ['ADM_NO','GENDER', 'PERCENTAGE'])
print("To display Adm_No, Gender and
Percentage from ‘student_result.csv’ file.")
print(df)
#To display first 5 and last 5 records from
‘student_result.csv’ file.
df1 =
pd.read_csv("student_result.csv")
print(df1.head())
print(df1.tail())
3.# Read the
‘Student_result.csv’ to create a data frame and do the following operation:
# To display
Student_result file with new column names.
# To modify the
Percentage of student below 40 with NaN value in dataframe.
import pandas as
pd
import numpy as
np
import csv
df =
pd.read_csv("student_result.csv")
print(df)
#To display
Student_result file with new column names.
df1 =
pd.read_csv("student_result.csv",skiprows = 1,names =
['Adno','Sex','Name','Eng','Hin',
'Maths','Sc.','SSt','San','IT','Perc'])
print("To
display Student_result file with new column names")
print(df1)
# To modify the
Percentage of student below 40 with NaN value.
df2 = pd.read_csv("student_result.csv")
print(df2)
print("To
modify the Percentage of student below 40 with NaN value.")
df2.loc[(df2['PERCENTAGE']
<40, 'PERCENTAGE')] = np.nan
print(df2)
4. # Read the
‘Student_result.csv’ to create a data frame and do the following operation:
# To create a
duplicate file for ‘student_result.csv’ containing Adm_No, Name and Percentage.
# Write the
statement in Pandas to find the highest percentage and also print the student’s
name and percentage.
import pandas as
pd
import numpy as
np
import csv
# To create a
duplicate file for ‘student_result.csv’ containing Adm_No, Name and Percentage.
df =
pd.read_csv("student_result.csv")
df.to_csv('copyStudent_result.csv',columns=['ADM_NO',"STUDENT'S_NAME","PERCENTAGE"])
# Display Copied
Dataframe
df2=pd.read_csv("copyStudent_result.csv")
print(df2)
# find the
highest percentage and also print the student’s name and percentage.
df1 =
pd.read_csv("student_result.csv")
df1 =
df1[["STUDENT'S_NAME",'PERCENTAGE']]
[df1.PERCENTAGE==
df1['PERCENTAGE'].max()]
print(df1)
5. # Replace all
negative values in a data frame with a 0.
import pandas as
pd
data =
{'sales1':[10,20,-4,5,-1,15], 'sales2':[20,15,10,-1,12,-2]}
df =
pd.DataFrame(data)
print("Data
Frame")
print(df)
print('Display DataFrame
after replacing every negative value with 0')
df[df<0]=0
print(df)
6. import pandas
as pd
import numpy as
np
Srec={'sid':[101,102,103,104,np.nan,106,107,108,109,110],
'sname':['Amit','Sumit',np.nan,'Aman','Rama','Neeta','Amjad','Ram','Ilma','Raja'],
'smarks':[98,67,np.nan,56,38,98,67,np.nan,56,np.nan],
'sgrade':[np.nan,np.nan,'A1','C1','D','A1','B2',np.nan,'B2','A2'],
'remark':['P','P','P','F',np.nan,'P','P','F','P','P'],
'mobile':[9990009991,9990009992,9990009993,np.nan,9990009995,np.nan,
9990009997,
9990009998,
np.nan,9999010000]}
# Convert the
dictionary into DataFrame
df=pd.DataFrame(Srec)
print("\n-
Dataframe Before Replacing NaN with 999-\n")
print(df)
#Replace missing
value with zeros
print("\n-After
Replacing missing value with 999-\n")
df=df.fillna(999)
print(df)
X--------------------------------------------X-------------------------------------------------X
Comments
Post a Comment