Wednesday, December 16, 2009

Finding potential problems in CSV files

I periodically have to work with CSV files from a variety of sources, and the problems with them are pretty well known. Here is a little Python script I use to find values that contain double quotes, which are often not properly escaped in the files I receive.

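import sys

fn = sys.argv[1]  # assumption: the CSV path is passed as the first argument

with open(fn, "r") as f:
    for line in f:
        # Naive split: assumes no field contains an embedded comma
        for v in line.split(","):
            # A double quote anywhere but the first or last character is suspect
            if '"' in v.strip()[1:-1]:
                print(v)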
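
If it helps to know where the offending values are, the same check can be wrapped in a function that also reports line numbers. This is only a sketch of one way to do it: the name find_suspect_fields and the sample rows are made up for illustration, and it keeps the same naive comma split as the script above, so fields with embedded commas will be split in the wrong places.

```python
def find_suspect_fields(lines):
    """Return (line_number, field) pairs where the field has a double
    quote anywhere other than its first or last character."""
    suspects = []
    for line_number, line in enumerate(lines, start=1):
        # Same naive split as the script above: embedded commas are not handled.
        for field in line.split(","):
            stripped = field.strip()
            if '"' in stripped[1:-1]:
                suspects.append((line_number, stripped))
    return suspects


# Hypothetical sample data: the third row has an unescaped inner quote.
sample = [
    'id,name,comment\n',
    '1,"Alice","fine"\n',
    '2,"Bob","said "hi" to me"\n',
]
for line_number, field in find_suspect_fields(sample):
    print(line_number, field)
```

Note that a correctly escaped value (inner quotes doubled, as in "say ""hi""") is flagged too, so the output is best treated as a list of candidates to review rather than a list of confirmed errors.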
