delimiting text commas

All times are UTC - 6 hours

Posted: Sat Sep 19, 2009 3:22 pm

Joined: Thu Oct 09, 2008 3:26 am
Posts: 15
Location: Columbus, OH
I've got a huge (156.8 MB) CSV file and tried to extract one column using awk:
awk -F"," '{print $29}' Parcel.CSV
and found that some entries for the owner's address use a "city, state" format that gives improper output because of the extra comma. Is there any way to keep commas inside double-quoted fields from being treated as delimiters? OpenOffice doesn't have a problem with this, but it only opens the first 65,536 entries.

Sample with no prob:
"230-000000",0,0,0,13300,0,13300,"O079A","007.00",100,2050,2511," ","PAUL A & BONNIE G LASTNAME","","9620 ROAD CHAPEL GEO RD","W JEFFERSON OH 43162","12/08/1999","LASTNAME PAUL A","LASTNAME BONNIE G",""," 83.00","N","F",1,3.02,2775,39.00,"ROAD RD","ROAD RD REAR","ENTRY 6305","3.015 ACRES","PLEASANT TOWNSHIP",0," ",0,0,0,0," ","","","0",""," "," .0",0,"1"," 0"

Sample with prob:
"230-000001",0,0,0,25600,0,25600,"O079A","010.00",501,0,2511," ","WELLS FARGO REAL ESTATE","MAC X2302-04D","1 HOME CAMPUS","DES MOINES, IA 50328","11/04/1998","LASTNAME LYNETTE L &","SCOTT A",""," 83.00","N","R",1,2.59,37740,485.38,"ROAD RD","ROAD RD REAR","2.593 ACRES","","PLEASANT TOWNSHIP",0," ",0,0,0,0," ","","","0",""," "," .0",0,"1"," 0"

Posted: Mon Oct 05, 2009 7:08 pm

Joined: Mon Oct 05, 2009 6:31 pm
Posts: 6
This is indeed a pickle you have here. The only thing I could suggest is to look for some sort of identifiable pattern for the offending commas and try to filter them out using sed or some such line editor first.

Just using the example you provided, there is a space after the offending comma in "DES MOINES, IA 50328". I would expect a delimiter comma to be followed by something other than a whitespace character, unless there are blank fields that hold whitespace, in which case that shoots the theory all to hell :-\

If there are no such fields, a simple search and replace for ', ' should do the trick for you. HTH, BOL!
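
For what it's worth, here is a rough sketch of both routes in plain sed and awk. The helper names field_counts and csv_col are made up for illustration, and the whole thing assumes no quoted field contains a line break. The first helper tests the comma-space theory by counting fields per line after the replacement; the second skips the replacement and walks each line character by character, ignoring commas while inside double quotes.

```shell
# Hypothetical helpers; the names are illustrative, not from the thread.

# field_counts FILE
# Test the ", " theory: replace every comma-space pair with a space, then
# tally how many comma-separated fields each line has. A single distinct
# count suggests the search and replace is safe for this file.
field_counts() {
    sed 's/, / /g' "$1" |
        awk -F',' '{ count[NF]++ } END { for (n in count) print n, count[n] }' |
        sort -n
}

# csv_col N FILE
# Print field N of each line, treating commas inside double quotes as data.
# Assumes no quoted field contains a line break. Quotes are kept in the output.
csv_col() {
    awk -v col="$1" '
    {
        n = 0; field = ""; inq = 0
        len = length($0)
        for (i = 1; i <= len; i++) {
            c = substr($0, i, 1)
            if (c == "\"") {            # entering or leaving a quoted field
                inq = !inq
                field = field c
            } else if (c == "," && !inq) {
                n++                     # an unquoted comma ends field n
                if (n == col) break
                field = ""
            } else {
                field = field c
            }
        }
        if (i > len) n++                # the last field has no trailing comma
        print (n == col ? field : "")
    }' "$2"
}
```

Something like `field_counts Parcel.CSV` prints each field count next to the number of lines that have it, and `csv_col 29 Parcel.CSV` would stand in for the original awk command. The character loop will likely be slow on a file that size, but it leaves the data untouched, whereas the search and replace drops the comma from the address.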
