
replacing substrings in variables


All times are UTC - 6 hours


Posted: Sat Dec 06, 2008 8:08 am

Joined: Fri Dec 05, 2008 11:01 am
Posts: 6
Hi.

I'm new to bash and new to this forum.

My question, in short: can the "for" command take a regular expression to describe the files you wish to iterate through? For example...

for a in "^[0-9]*_.*"; do echo $a; done

...where ^[0-9]*_ above is meant to be a regular expression representing any filename that starts with a group of digits, then an underscore, and then anything at all.

I have a directory with a large number of files named like "183729872_real_file_name.ext". The number of digits in the leading number varies. With a for loop I want to iterate through just these files and operate on them. I was wondering if using "for" in the above described manner will help.

Any thoughts? Suggestions?

Thanks, and I'm glad there's a forum like this out there!

John


Posted: Mon Dec 08, 2008 6:51 am
Moderator

Joined: Thu Oct 11, 2007 7:12 am
Posts: 229
Location: London - UK
Sadly, you can't match filenames with regular expressions in a "for" loop directly, although you may be able to achieve the same thing using globs.

This should be able to directly drop into your script though...

Code:
for a in $(ls | egrep '^[0-9]*_.*'); do echo $a; done
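
For the simple pattern in the question, a plain glob plus a small check may be enough, and it avoids parsing the output of ls. The sketch below is only an illustration, not something from the posts: the function name list_numbered and its directory argument are made up. The glob [0-9]*_* only guarantees a leading digit and an underscore somewhere in the name, so the case statement additionally checks that everything before the first underscore is digits.

```shell
# Hypothetical helper: print the names in directory $1 that look like
# "<digits>_<anything>", one per line.
list_numbered() {
    dir=${1:-.}
    for path in "$dir"/[0-9]*_*; do
        [ -e "$path" ] || continue      # glob matched nothing: skip the literal pattern
        name=${path##*/}                # strip the directory part
        prefix=${name%%_*}              # text before the first underscore
        case $prefix in
            *[!0-9]*) continue ;;       # prefix contains a non-digit: not a match
        esac
        printf '%s\n' "$name"
    done
}
```

Calling list_numbered . lists the matching files in the current directory. Because each name stays in a quoted variable, spaces and other odd characters are not split.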


Posted: Mon Dec 08, 2008 7:36 am

Joined: Mon Nov 17, 2008 7:25 am
Posts: 221
You can actually do some regexp-like matching with shell globs... [0-9] works, at least.

But agreed... if you're looking for more serious regexps you need to do something like DarthWavy suggests.
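
If a real regular expression is needed, bash itself can test each name with the =~ operator inside [[ ]] (available since bash 3), so the loop can still run over a glob instead of over ls output. A minimal sketch, assuming bash rather than plain sh; the function name list_matching and its two arguments are chosen only for illustration:

```shell
# Requires bash (uses [[ ]] and local).
# Hypothetical helper: print the names in directory $1 that match the
# extended regular expression given in $2.
list_matching() {
    local dir=${1:-.} re=$2 path name
    for path in "$dir"/*; do
        [[ -e $path ]] || continue      # glob matched nothing
        name=${path##*/}                # strip the directory part
        # $re is left unquoted on the right-hand side so it is used as a regex.
        if [[ $name =~ $re ]]; then
            printf '%s\n' "$name"
        fi
    done
}
```

For the files in the question, something like list_matching . '^[0-9]+_' should do; the + (rather than *) requires at least one leading digit.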

The problem with doing it with a for loop (as I've probably said a lot before :P) is the handling of filenames that contain a space character :/

I use two ways to counter this. One is replacing spaces with something obscure that shouldn't show up anywhere; the other is to try the IFS variable.
The problem with using IFS is that some text lines don't really contain \n, some use \r, and some lines just get garbled.

Doing something like this should work:

Code:
for a in $(ls | egrep '^[0-9]*_.*' | replace " " "::::"); do echo $a | replace "::::" " "; done
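
The IFS idea mentioned above can be sketched like this: restrict word splitting to newlines while the command substitution is expanded, so names with spaces stay in one piece. This is only an illustration (the function name loop_with_ifs is made up), it uses + in the pattern so that at least one digit is required, and it still breaks on filenames that contain a newline.

```shell
# Hypothetical helper: loop over "ls | grep" output, splitting on newlines only.
# The body runs in a subshell ( ... ), so the IFS and set -f changes do not
# leak into the calling shell.
loop_with_ifs() (
    dir=${1:-.}
    IFS='
'
    set -f                              # do not glob-expand the split words
    for a in $(ls "$dir" | grep -E '^[0-9]+_'); do
        printf '[%s]\n' "$a"
    done
)
```

The brackets in the output are only there to make the boundaries of each name visible.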


Posted: Tue Dec 09, 2008 5:00 am
Moderator

Joined: Thu Oct 11, 2007 7:12 am
Posts: 229
Location: London - UK
Using a different type of loop can help with the spaces issue:

Code:
ls | egrep '^[0-9]*_.*' | while read a ; do echo $a ; done


As I generally work exclusively with Linux, where it's rarer to encounter spaces in filenames, I often forget :)
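
One possible refinement of the while loop: a plain read strips leading and trailing whitespace and treats backslashes as escapes, and an unquoted $a is split again before echo sees it (so runs of spaces collapse). Using IFS= read -r and quoting the variable avoids both. The sketch below assumes filenames without embedded newlines, and the function name loop_with_read is hypothetical.

```shell
# Hypothetical helper: read matching names line by line and keep them intact.
loop_with_read() {
    ls "${1:-.}" | grep -E '^[0-9]+_' | while IFS= read -r a; do
        printf '[%s]\n' "$a"            # quoted, so the name is passed through unchanged
    done
}
```

Keep in mind that in bash, by default, the loop body runs in a subshell because it sits at the end of a pipeline, so variables set inside it are not visible after the loop.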


Posted: Tue Dec 09, 2008 5:46 am

Joined: Mon Nov 17, 2008 7:25 am
Posts: 221
DarthWavy wrote:
Using a different type of loop can help with the spaces issue:

Code:
ls | egrep '^[0-9]*_.*' | while read a ; do echo $a ; done


As I generally work exclusively with Linux, where it's rarer to encounter spaces in filenames, I often forget :)


Yes, I know, but lately it's been quite a common occurrence for me to encounter spaced filenames, both at work and in private :P
I must say it's quite annoying though; I like my files unspaced and nice :P


Posted: Wed Dec 10, 2008 5:04 am

Joined: Fri Dec 05, 2008 11:01 am
Posts: 6
Thanks for your helpful replies.

Actually, these files originally came from a Windows system. There are lots of spaces in the filenames, along with apostrophes, commas, parentheses and pretty much everything else the OS allowed the users to name them with.

So given that, it seems like the "while" variant is my best choice. Otherwise I could end up doing a large number of the double replacements described by fredrik for all those wacky characters, no?

Anyway, I tried the while version and it gives me the results I hoped for.

Thanks again!


Posted: Wed Dec 10, 2008 5:37 am

Joined: Mon Nov 17, 2008 7:25 am
Posts: 221
Well, you could add something like this with sed:

sed -e 's/\([[:punct:]]\)/\\\1/g'

Note that sed refers to the captured group as \1 (not $1), and the single quotes keep the shell from touching the backslashes. Dunno how every odd character (like " or ') would react to that, but it should work :P
What this does is add a \ before every punctuation character, thereby making them "normal" characters that the shell won't interpret as something else.
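
To see what such a sed expression does, here is a small sketch. It uses \1, which is sed's syntax for the captured group, inside single quotes so that the shell leaves the backslashes alone. The function name escape_punct is made up. The escaped text is only useful if a shell later re-parses it; when a variable is simply quoted ("$a"), no escaping is needed.

```shell
# Hypothetical helper: put a backslash before every punctuation character
# in the text read from standard input.
escape_punct() {
    sed -e 's/\([[:punct:]]\)/\\\1/g'
}
```

For example, printf '%s\n' 'a (b), c.txt' | escape_punct prints a \(b\)\, c\.txt.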

Best regards
Fredrik Eriksson


Posted: Wed Dec 10, 2008 7:50 am

Joined: Fri Dec 05, 2008 11:01 am
Posts: 6
Thanks. Sed is next on my to-learn list.


© 2003 - 2011 USA LINUX USERS GROUP