
grep search within xml


All times are UTC - 6 hours


Posted: Wed Jan 28, 2009 2:21 am

Joined: Wed Jan 28, 2009 2:16 am
Posts: 3
I have a bunch of folders with files buried inside, all having the same name. I'm currently looping through each folder and finding the file itself.

Code so far:

Code:
for F in `find /var/mobile/Applications/ -type f -name 'Info.plist' -print`; do Code=$(grep -e "<string>[0-9.([0-9]|[0-9].[0-9])]</string>" "$F" | awk -FY '{print $2}') echo $Code ; done


I'm trying to search the file and select the version number (the 2.0 in the example below):
Quote:
<key>CFBundleVersion</key>
<string>2.0</string>


The problem is that there are many <string> containers which also have digits separated by commas; I'm only looking for the line below CFBundleVersion.

Does anyone have any idea what grep command I can use to pick out the text I need and echo it within the for loop?

Thanks a lot
Michael


Posted: Wed Jan 28, 2009 3:06 am

Joined: Mon Nov 17, 2008 7:25 am
Posts: 221
You can do something like this :)

Code:
#!/bin/bash
pattern="<key>CFBundleVersion</key>"

for i in $(find /var/mobile/Applications/ -type f -name 'Info.plist'); do
   key=$(cat -n $i | grep "$pattern")
   line=$(echo $key | awk {'print $1'})
   let line=line+1
   string=$(head -n$line $i | tail -n1)
   string=$(echo $string | sed /^<string>(\d\+\.\d\+)<\/string>$/\1/)
   echo "$i contains $string key value"
done


To explain what I've just done: first "cat -n" puts a number on each line, then we cut out only the line number with awk and add 1 to it to end up on the next line, which should contain your <string> (this can of course be tested with some if statements).
Then we use head to list everything up to the line we want and use tail to cut away the part above it.
Then I did some line formatting for you :)

This is useful only if you have no idea which line number you're looking for and if you have this problem with too many <string>decimalnumber</string> lines.
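The "find the key, then take the next line" idea above can also be sketched as a single awk pass, which avoids the cat/head/tail chain. This is only a sketch: the function name next_line_after_key is made up for illustration, and it assumes the <string> value always sits on the line directly after the key.

```bash
#!/bin/bash
# Hypothetical helper (name is illustrative, not from the thread):
# print the line that directly follows the CFBundleVersion key in file $1.
next_line_after_key() {
    awk '/<key>CFBundleVersion<\/key>/ {
             # getline reads the next input line into "line"
             if ((getline line) > 0) print line
             exit    # stop after the first match
         }' "$1"
}

# Usage: next_line_after_key "/path/to/Info.plist"
```

The output still includes the surrounding <string> tags (and any leading indentation), so it needs the same clean-up step as the head/tail version.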

If the line will always be at the same position (line number) in the files you can use sed to just print that specific line.
Code:
sed -n '4p' file

I believe that will print only the 4th line of the file (without -n, sed prints every line and repeats the 4th); there is more information on Google about this :)
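A quick way to check this on a throwaway file is sketched below. sed prints every input line by default, so the -n option is what restricts the output to the addressed line. The function name print_line is made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: print only line number $1 of file $2.
# -n suppresses sed's default output; "Np" prints line N explicitly.
print_line() {
    sed -n "${1}p" "$2"
}

# Usage on a small test file:
#   printf 'one\ntwo\nthree\nfour\nfive\n' > demo.txt
#   print_line 4 demo.txt    # prints: four
```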

Best regards
Fredrik Eriksson


Posted: Wed Jan 28, 2009 2:49 pm

Joined: Wed Jan 28, 2009 2:16 am
Posts: 3
I seem to get an error with the code saying "unknown command d+.d+"

I've got a little further by using:

Code:
#!/bin/bash
pattern="<key>CFBundleVersion</key>"

for i in $(find /var/mobile/Applications/ -type f -name 'Info.plist'); do
   appDirName=`dirname "$i"`
   appName=`echo "$appDirName" | cut -d '/' -f 6`
   string=$(grep -A 1 '<key>CFBundleVersion</key>' "$i" )
   echo "$appName - $string"
done


This outputs what I want, but there are still two issues. One is that some of the directories have spaces in their paths, which causes the grep command to fail when it reaches the space.

The other is that $string is shown as:

Quote:
Folder2 - <key>CFBundleVersion</key>
<string>1.0</string>


When I simply want the 1.0 to show, not the complete two lines.
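One way to keep only the value is to pipe the two lines from grep -A 1 through sed and print just the text between the <string> tags. This is a minimal sketch, assuming the plist is stored as XML text and that the value sits on the line right after the key; the function name bundle_version is made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: print the CFBundleVersion value of plist file $1.
bundle_version() {
    # grep -A 1 emits the key line plus the line after it;
    # sed -n ... p prints only lines where the substitution matched,
    # i.e. the <string> line, reduced to its contents.
    grep -A 1 '<key>CFBundleVersion</key>' "$1" |
        sed -n 's:.*<string>\(.*\)</string>.*:\1:p'
}

# Usage inside the loop: string=$(bundle_version "$i")
```

The colon is used as the sed delimiter so the slash in </string> does not need escaping.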

Thanks for your help Fredrik.


Posted: Wed Jan 28, 2009 4:18 pm

Joined: Mon Nov 17, 2008 7:25 am
Posts: 221
Code:
#!/bin/bash
pattern="<key>CFBundleVersion</key>"

for i in $(find /var/mobile/Applications/ -type f -name 'Info.plist'); do
   key=$(cat -n $i | grep "$pattern")
   line=$(echo $key | awk {'print $1'})
   let line=line+1
   string=$(head -n$line $i | tail -n1)
   string=$(echo $string | sed "/^<string>(\d\+\.\d\+)<\/string>$/\1/")
   echo "$i contains $string key value"
done


Sorry, I forgot to quote the sed line :)

Best regards
Fredrik Eriksson


Posted: Wed Jan 28, 2009 4:26 pm

Joined: Wed Jan 28, 2009 2:16 am
Posts: 3
It seems to spit out this:

Quote:
sed: -e expression #1, char 35: unknown command: `\'
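The error is consistent with the expression lacking the leading s of a substitute command: sed reads /^<string>...$/ as an address and then tries to treat the following backslash (character 35) as a command. Two further points would also need fixing: \d is a Perl-style class that sed does not understand, and unescaped parentheses are literal in basic regular expressions. A sketch of a corrected substitution follows, assuming a sed that accepts -E for extended regular expressions (older GNU sed spells this -r); the function name strip_string_tags is made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: read a <string>N.N</string> line on stdin
# and print only the dotted number between the tags.
strip_string_tags() {
    # s/.../\1/  substitute command (note the leading "s")
    # [0-9]      POSIX digit class instead of \d
    # -E         lets ( ) and + be used without backslashes
    sed -E 's/^[[:space:]]*<string>([0-9]+(\.[0-9]+)*)<\/string>[[:space:]]*$/\1/'
}

# Usage: string=$(echo "$string" | strip_string_tags)
```

Lines that do not match the pattern pass through unchanged, so a value that is not purely digits and dots is still visible in the output rather than silently dropped.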


Also, there is still the issue with directory paths that contain spaces in folder names:
Quote:
head: cannot open


Posted: Thu Jan 29, 2009 2:33 am

Joined: Mon Nov 17, 2008 7:25 am
Posts: 221
Encase everything in quotes ;P
That is, put double quotes around every variable that holds a file or directory name which may contain whitespace (for example "$i" instead of $i).

I was doing it quick'n'dirty so some things slipped by :)

edit: and remove the sed and echo, you can sort that out yourself :)
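Quoting "$i" in the loop body is needed, but it may not be enough on its own: for i in $(find ...) has already split find's output on whitespace before the body runs, so a path with a space arrives as two separate words. A sketch that avoids this reads NUL-delimited names instead. It assumes a find that supports -print0 and bash's read -d ''; the function name list_versions is made up for illustration, and it uses the plist's immediate parent folder as the app name, which is a simplification of the cut -f 6 used earlier in the thread.

```bash
#!/bin/bash
# Hypothetical helper: for every Info.plist under directory $1,
# print "<parent folder> - <CFBundleVersion value>".
list_versions() {
    local root=$1 plist app version
    # -print0 ends each name with a NUL byte; read -d '' splits on NUL,
    # so spaces in paths survive intact.
    find "$root" -type f -name 'Info.plist' -print0 |
    while IFS= read -r -d '' plist; do
        app=$(basename "$(dirname "$plist")")
        # Take the line after the key and keep only the text inside <string>.
        version=$(grep -A 1 '<key>CFBundleVersion</key>' "$plist" |
                  sed -n 's:.*<string>\(.*\)</string>.*:\1:p')
        printf '%s - %s\n' "$app" "$version"
    done
}

# Usage: list_versions /var/mobile/Applications/
```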

Best regards
Fredrik Eriksson


© 2003 - 2011 USA LINUX USERS GROUP