Handling CSV files as a system administrator can be a bit of effort. The Python CSV Processing Scripts shared in this article can automate these processes, making your job more efficient. Here, I’ve curated a list of 10 essential Python CSV Processing Scripts tailored for system administration tasks.
Table of Contents
Read CSV file and print each row
Simply reads a CSV file and prints each row to the console.
import csv
with open('file.csv', 'r') as csvfile:
csvreader = csv.reader(csvfile)
for row in csvreader:
print(row)
csv file:
Name,Age,City
John,25,New York
Jane,30,Los Angeles
output:
['Name', 'Age', 'City']
['John', '25', 'New York']
['Jane', '30', 'Los Angeles']
Write data to a CSV file
Writes data to a CSV file.
import csv
data = [['Name', 'Age', 'City'], ['John', '25', 'New York'], ['Jane', '30', 'Los Angeles']]
with open('file.csv', 'w', newline='') as csvfile:
csvwriter = csv.writer(csvfile)
csvwriter.writerows(data)
Output file:
Name,Age,City
John,25,New York
Jane,30,Los Angeles
Count number of rows in a CSV file
Counts the number of rows in a CSV file.
import csv
with open('file.csv', 'r') as csvfile:
csvreader = csv.reader(csvfile)
row_count = sum(1 for row in csvreader)
print(row_count)
csv file:
Name,Age,City
John,25,New York
Jane,30,Los Angeles
output:
3
Filter data in a CSV file
Filters data in a CSV file based on a condition.
import csv
with open('file.csv', 'r') as csvfile:
csvreader = csv.reader(csvfile)
for row in csvreader:
if row[2].strip() == 'New York':
print(row)
csv file:
Name,Age,City
John,25,New York
Jane,30,Los Angeles
output:
['John', '25', 'New York']
Sort data in a CSV file
Sorts data in a CSV file based on a column.
import csv
with open('sort.csv', 'r') as csvfile:
csvreader = csv.reader(csvfile)
sorted_csv = sorted(csvreader, key=lambda row: row[1])
for row in sorted_csv:
print(row)
csv file:
Name,Age,City
Jane,30,Los Angeles
John,25,New York
output:
['John', '25', 'New York']
['Jane', '30', 'Los Angeles']
Extract specific columns from a CSV file
Extract specific columns from a CSV file.
import csv
input_file = 'file.csv'
columns_to_extract = ['Age', 'Name'] # Replace with the header names of the columns you want to extract
with open(input_file, 'r') as csv_file:
reader = csv.DictReader(csv_file)
headers = reader.fieldnames
headers_to_print = [header for header in headers if header in columns_to_extract]
print(','.join(headers_to_print))
for row in reader:
row_to_print = [row[column] for column in columns_to_extract]
print(','.join(row_to_print))
csv file:
Name,Age,City
John,25,New York
Jane,30,Los Angeles
output:
Name,Age
25,John
30,Jane
Replace values in a CSV file
Replaces values in a CSV file based on a condition.
import csv
with open('file.csv', 'r') as csvfile:
csvreader = csv.reader(csvfile)
data = []
for row in csvreader:
if row[2] == 'New York':
row[2] = 'NY'
data.append(row)
with open('updated_file.csv', 'w', newline='') as outfile:
csvwriter = csv.writer(outfile)
csvwriter.writerows(data)
csv file:
Name,Age,City
John,25,New York
Jane,30,Los Angeles
output:
Name,Age,City
John,25,NY
Jane,30,Los Angeles
Calculate average of a column in a CSV file
Calculates the average of a column in a CSV file.
import csv
with open('file.csv', 'r') as csvfile:
csvreader = csv.reader(csvfile)
next(csvreader)
total = 0
count = 0
for row in csvreader:
total += int(row[1])
count += 1
average = total / count
print(average)
csv file:
Name,Age,City
John,25,New York
Jane,30,Los Angeles
output:
27.5
Remove duplicates from a CSV file
Removes duplicates from a CSV file based on a column.
import csv
with open('uniq.csv', 'r') as csvfile:
csvreader = csv.reader(csvfile)
data = []
seen = set()
for row in csvreader:
if row[2] not in seen:
data.append(row)
seen.add(row[2])
with open('unique_file.csv', 'w', newline='') as outfile:
csvwriter = csv.writer(outfile)
csvwriter.writerows(data)
csv file:
Name,Age,City
John,25,New York
Jane,30,Los Angeles
John,25,New York
output:
Name,Age,City
John,25,New York
Jane,30,Los Angeles
Sort a CSV file based on a column
Sorts a CSV file based on a column.
import csv
with open('sortcl.csv', 'r') as csvfile:
csvreader = csv.reader(csvfile)
header = next(csvreader)
index = header.index('Age')
sorted_data = sorted(csvreader, key=lambda row: row[index])
print(header)
for row in sorted_data:
print(row)
csv file:
Name,Age,City
John,25,New York
Jane,30,Los Angeles
John,27,New York
output:
['Name', 'Age', 'City']
['John', '25', 'New York']
['John', '27', 'New York']
['Jane', '30', 'Los Angeles']
Link to GitHub Gist : https://gist.github.com/linuxinsights/ff64b25ec20513c13083673ed7c0dc39
Conclusion
Python is a powerful tool for automating system maintenance jobs that involve processing CSV files. The 10 scripts we shared in this post can do a wide range of things, such as screening, joining, cleaning, and sorting data. By using these scripts, you can streamline your processes, save time and effort, and make sure that your data handling tasks are accurate and consistent. We hope that these tools will encourage you to learn more about what Python can do for CSV processing and help you become a better system administrator.
Got any queries or feedback? Feel free to drop a comment below!