Deleting rows with Python in a CSV file

All I would like to do is delete a row if it has a value of ‘0’ in the third column. An example of the data would be something like:

6.5, 5.4, 0, 320
6.5, 5.4, 1, 320

So the first row would need to be deleted whereas the second would stay.

What I have so far is as follows:

import csv
input = open('first.csv', 'rb')
output = open('first_edit.csv', 'wb')
writer = csv.writer(output)
for row in csv.reader(input):
    if row[2]!=0:
        writer.writerow(row)
input.close()
output.close()

Any help would be great

Method 1

You are very close. Currently you compare row[2] with the integer 0; compare it with the string "0" instead. Values read from a file are strings, not integers, so the integer comparison is always unequal and no row is ever skipped:

if row[2] != "0":

Also, you can use the with keyword to make the current code slightly more pythonic so that the lines in your code are reduced and you can omit the .close statements:

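import csv
# 'rb'/'wb' are the Python 2 modes; on Python 3 open in text mode with newline=''
with open('first.csv', 'rb') as inp, open('first_edit.csv', 'wb') as out:
    writer = csv.writer(out)
    for row in csv.reader(inp):
        if row[2] != "0":
            writer.writerow(row)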

Note that input is a Python builtin, so I’ve used another variable name instead.

Edit: The values in your csv file’s rows are separated by a comma and a space. In a normal csv they would be separated by a comma only, and a check against "0" would work. So you can either use row[2].strip() != "0" or check against " 0".

The better solution would be to correct the csv format, but in case you want to persist with the current one, the following will work with your given csv file format:

$ cat 
import csv
with open('first.csv', 'rb') as inp, open('first_edit.csv', 'wb') as out:
    writer = csv.writer(out)
    for row in csv.reader(inp):
        if row[2] != " 0":
            writer.writerow(row)
$ cat first.csv 
6.5, 5.4, 0, 320
6.5, 5.4, 1, 320
$ python 
$ cat first_edit.csv 
6.5, 5.4, 1, 320
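
The transcript above copes with the leading space by comparing against " 0". As an alternative, here is a minimal sketch for Python 3 that lets the csv module drop the space itself through its skipinitialspace option. The function name drop_zero_rows and its parameters are made up for illustration, and the sketch assumes every row has at least column + 1 cells:

```python
import csv

def drop_zero_rows(src_path, dst_path, column=2):
    """Copy src_path to dst_path, skipping rows whose `column` cell is "0"."""
    # Python 3: the csv module expects text-mode files opened with newline=''.
    with open(src_path, newline='') as inp, open(dst_path, 'w', newline='') as out:
        # skipinitialspace=True ignores the space that follows each comma,
        # so the cell reads "0" rather than " 0".
        reader = csv.reader(inp, skipinitialspace=True)
        writer = csv.writer(out)
        for row in reader:
            if row[column] != "0":
                writer.writerow(row)
```

Calling drop_zero_rows('first.csv', 'first_edit.csv') on the sample data should leave only the second row. The output is written with plain commas, because the writer does not put the spaces back.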

Method 2

Use the pandas library.

The solution for the question:

import pandas as pd

df = pd.read_csv(file)
df = df[df.Name != "dog"]

# df.column_name != the whole string in the cell
# now every row whose Name column holds the value "dog" is deleted

df.to_csv(file, index=False)
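
The snippet above filters on a named column. The data in the question has no header row and has a space after each comma, so a sketch for that case might look like the following. The function name drop_zero_rows_pandas is hypothetical, and the sketch assumes the third column holds only numbers, so pandas parses it as numeric and the comparison is against the integer 0:

```python
import pandas as pd

def drop_zero_rows_pandas(src_path, dst_path, column=2):
    """Write src_path to dst_path without the rows whose `column` equals 0."""
    # header=None: the sample data has no header row, so columns are 0, 1, 2, ...
    # skipinitialspace=True: ignore the space after each comma.
    df = pd.read_csv(src_path, header=None, skipinitialspace=True)
    # The column was parsed as numbers, so compare with the integer 0 here.
    df = df[df[column] != 0]
    # header=False keeps the automatically numbered column labels out of the file.
    df.to_csv(dst_path, index=False, header=False)
```

Writing to a separate output file, as the question does, keeps the original data intact if the filter turns out to be wrong.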

General solution:

Use this function:

import pandas as pd

def remove_specific_row_from_csv(file, column_name, *args):
    """
    :param file: file to remove the rows from
    :param column_name: the column that determines which rows are deleted
           (e.g. if the column is Name and args contains "Gavri",
           every row whose Name is "Gavri" is deleted)
    :param args: cell values to match in that column
    """
    try:
        df = pd.read_csv(file)
        for value in args:
            # Index by column name instead of building an expression for eval()
            df = df[df[column_name] != value]
        df.to_csv(file, index=False)
    except Exception as e:
        raise Exception("Error message....")

Example call:

remove_specific_row_from_csv(file_name, "column_name", "dog_for_example", "cat_for_example")

Note: You can pass any number of cell values to this function. Every row whose value in the given column equals one of them is deleted, and values that do not occur in that column are ignored.

Method 3

You should have if row[2] != "0". Otherwise you are comparing a string with the integer 0, and the two are never equal, so every row passes the check.
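
To make the point concrete: in Python "0" != 0 is always True, because a string never equals an integer. If you would rather compare numbers than strings, one possible approach is to convert the cell first. The helper name is_zero below is made up for illustration; it also accepts cells with surrounding spaces, since float() ignores them:

```python
def is_zero(cell):
    """Return True if a CSV cell (a string) holds the number zero."""
    try:
        # float() tolerates surrounding whitespace, so " 0" is handled too.
        return float(cell) == 0
    except ValueError:
        # Not a number at all (e.g. empty or text), so it is not zero.
        return False
```

With this helper the check in the loop would become if not is_zero(row[2]), which treats "0", " 0" and "0.0" alike.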

