
help with getting text using grep/ sed/ awk


All times are UTC - 6 hours


Posted: Wed Jun 01, 2011 7:21 am

Joined: Tue Apr 19, 2011 11:01 pm
Posts: 36
Hello,

I have an .html file that is generated by a reporting tool I use. I am writing a bash script that does various things, and one of them is to grab certain values from the .html report.

There is a lot of html code in the file (/tmp/report.html), but I want to zero in on this part:
Code:
<tr>
      <th class="group" align="right">Total</th>
      <td class="numeric bold">50</td>
      <td class="numeric bold">25</td>
      <td class="numeric bold">0</td>
      <td class="numeric bold">50.0</td>
      <td class="numeric bold">75.0</td>
      <td class="numeric bold">70%</td>
</tr>


I want to grab just the 50, 25, 0, 50.0, 75.0, and 70% (lined up vertically in the output, if at all possible).

When I run this command string:
Code:
grep -A6 "Total" /tmp/report.html | awk '{print $3}'

I get this output:
Code:
align="right">Total</th>
bold">50</td>
bold">25</td>
bold">0</td>
bold">50.0</td>
bold">75.0</td>
bold">70%</td>


The word "Total" appears only once in the HTML, so I thought that would be the place to start. The numeric values change from report to report, so I can't anchor on the "50" value itself. I also tried following some of the examples from this thread, but to no avail: viewtopic.php?f=21&t=1319

Sorry, I am not well versed in using sed and awk, so any help would be great!


Posted: Wed Jun 01, 2011 8:34 am

Joined: Tue Apr 27, 2010 2:28 pm
Posts: 172
Location: Czech Republic
Your specification is a bit unclear, but this may help:
Code:
echo $(grep -A6 Total /tmp/report.html | grep numeric | cut -f2 -d\> | cut -f1 -d\<)
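
To make the steps of that pipeline easier to follow, here is a minimal sketch that runs it against a throwaway copy of the table row from the first post. The temporary file and the variable names report and values are only assumptions for illustration, and -A is a grep extension (available in GNU and BSD grep) rather than a POSIX option.

```shell
#!/bin/sh
# Hypothetical sample file standing in for /tmp/report.html.
report=$(mktemp)
cat > "$report" <<'EOF'
<tr>
      <th class="group" align="right">Total</th>
      <td class="numeric bold">50</td>
      <td class="numeric bold">25</td>
      <td class="numeric bold">0</td>
      <td class="numeric bold">50.0</td>
      <td class="numeric bold">75.0</td>
      <td class="numeric bold">70%</td>
</tr>
EOF

# 1. grep -A6 keeps the "Total" line plus the six lines that follow it.
# 2. grep numeric drops the <th> line and keeps only the value cells.
# 3. The first cut keeps what follows the first '>' (e.g. "50</td").
# 4. The second cut keeps what precedes the next '<' (e.g. "50").
values=$(grep -A6 Total "$report" | grep numeric | cut -f2 -d\> | cut -f1 -d\<)

# Left unquoted on purpose: word splitting turns the newlines into spaces,
# which is what puts the values on one line.
echo $values

rm -f "$report"
```

Quoting the variable (echo "$values") would keep one value per line instead.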


Posted: Wed Jun 01, 2011 10:12 am

Joined: Mon Mar 02, 2009 3:03 am
Posts: 512
Hi,

I'd rather use sed, like this:
Code:
$ cat File
<tr>
      <th class="group" align="right">Total</th>
      <td class="numeric bold">50</td>
      <td class="numeric bold">25</td>
      <td class="numeric bold">0</td>
      <td class="numeric bold">50.0</td>
      <td class="numeric bold">75.0</td>
      <td class="numeric bold">70%</td>
</tr>
$ sed '/numeric/!d;s/[^>]*>\([^<]*\).*/\1/' File
50
25
0
50.0
75.0
70%
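
One caveat: the sed command keeps every line containing "numeric", so in a full report it would also pick up value cells from other rows. Below is a minimal sketch of one way to limit it to the Total row with a range address and then join the values on one line with paste. The extra "Group A" row, the temporary file, and the variable names are assumptions made up for the example, not part of the real report.

```shell
#!/bin/sh
# Hypothetical sample: one unrelated row, then the Total row.
report=$(mktemp)
cat > "$report" <<'EOF'
<tr>
      <th class="group" align="right">Group A</th>
      <td class="numeric">10</td>
</tr>
<tr>
      <th class="group" align="right">Total</th>
      <td class="numeric bold">50</td>
      <td class="numeric bold">25</td>
      <td class="numeric bold">0</td>
      <td class="numeric bold">50.0</td>
      <td class="numeric bold">75.0</td>
      <td class="numeric bold">70%</td>
</tr>
EOF

# -n suppresses default printing. The range /Total/,/<\/tr>/ selects the lines
# from the "Total" header to the closing </tr>. Inside that range, lines that
# contain "numeric" are reduced to the text between the first '>' and the
# next '<', then printed.
# paste -s joins the resulting lines with a single space.
row=$(sed -n '/Total/,/<\/tr>/{/numeric/s/[^>]*>\([^<]*\).*/\1/p;}' "$report" |
      paste -s -d ' ' -)

echo "$row"

rm -f "$report"
```

Dropping the paste stage gives the one-value-per-line output shown above.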


Posted: Wed Jun 01, 2011 10:29 am

Joined: Tue Apr 19, 2011 11:01 pm
Posts: 36
Thank you both for the replies. (I wrote the wrong thing in the opening post: I needed the values laid out horizontally, not vertically. My bad.)

choroba, that was exactly what I was looking for. Watel, yours worked too, though.

Thanks again.


Posted: Wed Jun 08, 2011 9:58 am

Joined: Wed Jun 08, 2011 8:27 am
Posts: 189
Location: outer Shpongolia
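With awk(1) (the test is $3 != "" rather than a bare $3, because a bare $3 is false for a numeric 0 and would skip that value):

Code:
awk -F '[<>]' '$3 != "" && $3 != "Total" { print $3 }' /tmp/report.html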
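
Since the values were wanted on one line in the end, here is a minimal awk sketch that collects only the Total row and prints it space-separated. The "Group A" row, the temporary file, and the names in_total, row, and sep are assumptions invented for the illustration.

```shell
#!/bin/sh
# Hypothetical sample: one unrelated row, then the Total row.
report=$(mktemp)
cat > "$report" <<'EOF'
<tr>
      <th class="group" align="right">Group A</th>
      <td class="numeric">10</td>
</tr>
<tr>
      <th class="group" align="right">Total</th>
      <td class="numeric bold">50</td>
      <td class="numeric bold">25</td>
      <td class="numeric bold">0</td>
      <td class="numeric bold">50.0</td>
      <td class="numeric bold">75.0</td>
      <td class="numeric bold">70%</td>
</tr>
EOF

# With '<' and '>' as field separators, $3 is the text inside each cell.
row=$(awk -F '[<>]' '
    $3 == "Total"        { in_total = 1; next }       # start collecting after the header cell
    /<\/tr>/             { in_total = 0 }             # stop at the end of the row
    in_total && $3 != "" { row = row sep $3; sep = " " }  # compare with "" so that 0 is kept
    END                  { print row }
' "$report")

echo "$row"

rm -f "$report"
```

The string comparison against "" is deliberate: a field holding 0 counts as false when used on its own as a condition.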

