
time code reformat


All times are UTC - 6 hours


Posted: Sat Jul 04, 2009 2:54 pm

Joined: Thu Oct 09, 2008 3:26 am
Posts: 15
Location: Columbus, OH
I've searched for "time" and "reformat" and can't find anything, so here goes. I need a script to reformat the time of day in a log file from four digits (hhmm) to two digits, a colon, and two digits (hh:mm). Awk might have the answer, but I'm not that good (yet =)) )


Posted: Mon Jul 06, 2009 12:46 am

Joined: Sat Jun 13, 2009 8:53 pm
Posts: 73
Location: Texas!
Well, if all of your log messages follow a pattern, then awk might work well. I don't know the log files, so I don't know how many four-digit numbers show up, but this sed script will catch every one that could be an hhmm time. Make sure that you don't redirect the output to the same file! The shell empties that file before sed reads it, so the contents will be erased.
Code:
sed -e 's/\([0-2][0-9]\)\([0-5][0-9]\)/\1:\2/g' file > file.temp
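
A minimal sketch of the "don't redirect to the same file" advice: write the result to a temporary file and move it over the original only if sed succeeded. The temporary files and the one-line sample input are assumptions made up for illustration; they are not from the original post.

```shell
# Hypothetical input file holding a single hhmm value.
log=$(mktemp)
printf '1117\n' > "$log"

# Write to a separate temp file first; "sed ... file > file" would
# truncate the input before sed ever reads it.
tmp=$(mktemp)
sed -e 's/\([0-2][0-9]\)\([0-5][0-9]\)/\1:\2/g' "$log" > "$tmp" && mv "$tmp" "$log"

cat "$log"
```

The `&&` keeps the original untouched if sed fails for any reason.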


Posted: Mon Jul 06, 2009 1:09 am

Joined: Mon Nov 17, 2008 7:25 am
Posts: 221
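You could do that, but does it matter whether the pattern rejects impossible times such as 25:60? If not, a simpler pattern that just puts a colon after the first two digits will do: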

Code:
sed -e 's/\([0-9]\{2\}\)/\1:/' file > new.file


Posted: Mon Jul 06, 2009 5:35 am

Joined: Thu Oct 09, 2008 3:26 am
Posts: 15
Location: Columbus, OH
Thank you very much for the help. I'll give both methods a try later today. Fyi, the log files come from the Space Weather Prediction Center's 1-minute updates of the GOES-10 satellite's x-ray flux, which measures solar flare output. Now I just have to figure out how to trip a threshold alert and trap redundant trips, but I think I can handle that. Again, thx!!!!!
http://www.swpc.noaa.gov/ftpdir/lists/xray/G10xr_1m.txt

example truncated for brevity:

    :Data_list: G10xr_1m.txt
    :Created: 2009 Jul 06 1123 UTC
    # Prepared by the U.S. Dept. of Commerce, NOAA, Space Weather Prediction Center
    # Please send comments and suggestions to SWPC.Webmaster@noaa.gov
    #
    # Label: Short = 0.05- 0.4 nanometer
    # Label: Long = 0.1 - 0.8 nanometer
    # Units: Short = Watts per meter squared
    # Units: Long = Watts per meter squared
    # Source: GOES-10
    # Location: W061
    # Missing data: -1.00e+05
    #
    # 1-minute GOES-10 Solar X-ray Flux
    #
    # Modified Seconds
    # UTC Date Time Julian of the
    # YR MO DA HHMM Day Day Short Long
    #-------------------------------------------------------
    2009 07 06 1117 55018 40620 3.77e-09 9.46e-09
    2009 07 06 1118 55018 40680 3.75e-09 9.46e-09
    2009 07 06 1119 55018 40740 3.77e-09 9.47e-09
    2009 07 06 1120 55018 40800 3.75e-09 1.10e-08
    2009 07 06 1121 55018 40860 5.50e-09 2.59e-08
    2009 07 06 1122 55018 40920 5.68e-09 3.59e-08


Posted: Mon Jul 06, 2009 5:59 am

Joined: Thu Oct 09, 2008 3:26 am
Posts: 15
Location: Columbus, OH
Reporting back. The first method reformats the time like I'd hoped; the second doesn't. Both also hit the 4-digit year in the first column (I should've included an output example to begin with), and the first method hits the sixth column (seconds of the day) as well. Is there a way to have sed operate only on the 4th column? The header will be stripped by the time (no pun intended) I employ the script. That's why I think awk, with its column-oriented operation, would be of use.

Again, thanks for all the help.
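
For the awk route asked about here, one possible sketch rewrites only field 4 and leaves every other column alone. It assumes the header is already stripped and that the columns are separated by single spaces, as in the excerpt above; the variable names are made up for illustration.

```shell
# Sample rows copied from the log excerpt above (header already stripped).
data='2009 07 06 1117 55018 40620 3.77e-09 9.46e-09
2009 07 06 1118 55018 40680 3.75e-09 9.46e-09'

# Rewrite only field 4: first two characters, a colon, then the last two.
# Assigning to $4 makes awk rebuild the line with single spaces between fields.
result=$(printf '%s\n' "$data" |
    awk 'length($4) == 4 { $4 = substr($4, 1, 2) ":" substr($4, 3, 2) } { print }')

printf '%s\n' "$result"
```

The `length($4) == 4` guard skips any line whose fourth field is not four characters long, so stray header or blank lines pass through unchanged.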


Posted: Mon Jul 06, 2009 6:24 am

Joined: Mon Nov 17, 2008 7:25 am
Posts: 221
Okay, I had assumed that the time would be stored in the first column of your line :)

But this is a monster regexp that does the job. It will only edit that particular section (and only if the start of the HHMM part is 11 characters in).
Code:
K0521# sed -e "s/^\(.\{11\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\(.*\)$/\1\2:\3\4/" tmp.txt
2009 07 06 11:17 55018 40620 3.77e-09 9.46e-09
2009 07 06 11:18 55018 40680 3.75e-09 9.46e-09
2009 07 06 11:19 55018 40740 3.77e-09 9.47e-09
2009 07 06 11:20 55018 40800 3.75e-09 1.10e-08
2009 07 06 11:21 55018 40860 5.50e-09 2.59e-08
2009 07 06 11:22 55018 40920 5.68e-09 3.59e-08


It captures the first section (2009 07 06 and the space after it) in \1, then splits the hours from the minutes into \2 and \3 and puts the colon between them. The rest of the line is stored in \4 (this is actually not needed, since you probably won't modify that part all that much :)

Best regards
Fredrik Eriksson
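
Since the post notes that \4 is not needed, here is a sketch of the same substitution without the trailing group. Whatever the pattern does not match is left in place by sed, so the rest of the line survives on its own. The sample line comes from the log excerpt earlier in the thread; the variable names are hypothetical.

```shell
line='2009 07 06 1117 55018 40620 3.77e-09 9.46e-09'

# Same idea as the longer regexp: skip 11 characters, then split the next
# four digits into hours (\2) and minutes (\3). No \(.*\)$ group is used.
short=$(printf '%s\n' "$line" |
    sed -e 's/^\(.\{11\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)/\1\2:\3/')

printf '%s\n' "$short"
```

Because the pattern is anchored with `^` and has no `g` flag, the year and the seconds-of-day column are never touched.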


Posted: Mon Jul 06, 2009 6:26 am

Joined: Mon Nov 17, 2008 7:25 am
Posts: 221
When it comes to thresholding, I would recommend using awk.

Code:
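# awk reads the file directly; no cat needed
value1=$(awk '{print $5}' file.txt)
value2=$(awk '{print $6}' file.txt)


This puts the values of columns 5 and 6 (one per input line) in value1 and value2. For the flux readings in the sample above, the columns to pick are 7 (Short) and 8 (Long).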
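
As a rough sketch of the thresholding itself, awk can compare the flux column against a limit and print only the rows that trip it. The limit used here is a made-up value chosen so that one of the sample rows matches; it is not a real flare threshold. Adding 0 to each side forces a numeric comparison of the exponent-notation values.

```shell
# Sample rows from the thread (time already reformatted).
data='2009 07 06 11:20 55018 40800 3.75e-09 1.10e-08
2009 07 06 11:21 55018 40860 5.50e-09 2.59e-08
2009 07 06 11:22 55018 40920 5.68e-09 3.59e-08'

limit=3e-08   # hypothetical threshold, for illustration only

# Print date, time and Long flux for every row at or above the limit.
hits=$(printf '%s\n' "$data" |
    awk -v limit="$limit" '($8 + 0) >= (limit + 0) { print $1, $2, $3, $4, $8 }')

printf '%s\n' "$hits"
```

The missing-data marker from the file header (-1.00e+05) is negative, so it falls below any positive limit and is never reported.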


Posted: Mon Jul 06, 2009 2:10 pm

Joined: Thu Oct 09, 2008 3:26 am
Posts: 15
Location: Columbus, OH
Much better. As far as the threshold goes, I thought about using awk to combine the two columns into a one-column list and then using sort. After that, I'd use grep to check for values of e-04 and greater (i.e. M class flares and better), then use the highest value returned to re-grep the original file and retrieve the date, time, etc.

If the threshold is hit, awk the date/time/flare class into an alert registry (text) file, so that if the system crashes the recipients won't get a flood of re-issued alerts upon reboot. Also, I'll have to figure out how to handle values greater than 10^3 (or ^2 I think), since they're logged as a decimal value instead of an exponent. This might actually have to be done with 5 (or so) grep operations, with a check on the alert file for values exceeding the previous alert, but I have quite a bit of time to get this done before the next solar max, which should be about 2012. Yeah, plenty of time :))

Again, thx not only for the help but also for increasing my understanding of sed. I've studied it for a while, but this really helps. I'm sure you've heard it before, but I didn't know you could do that with sed.
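
The alert registry idea could be sketched roughly like this: before sending an alert, check whether an identical line is already in the registry file, and append it only if it is new. The registry location, the `record_alert` function name, and the key format (date, time, flux value) are all assumptions for illustration; the plan above also stores the flare class, which this sketch leaves out.

```shell
registry=$(mktemp)   # hypothetical alert registry file

# record_alert KEY: append KEY unless an identical line is already registered.
# Returns 0 when the alert is new, 1 when it was already recorded.
record_alert() {
    if grep -qxF -- "$1" "$registry"; then
        return 1
    fi
    printf '%s\n' "$1" >> "$registry"
}

record_alert '2009 07 06 11:22 3.59e-08' && echo 'alert sent'
record_alert '2009 07 06 11:22 3.59e-08' || echo 'already alerted'
```

Because the registry is a plain text file on disk, it survives a reboot, which is what keeps old alerts from being sent again.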


Posted: Wed Jul 08, 2009 6:41 am

Joined: Mon Nov 17, 2008 7:25 am
Posts: 221
Why awk and sort and then use grep?

Sort can sort by columns also :)
Code:
hanna:~> sort -k8 tmp.txt
2009 07 06 11:20 55018 40800 3.75e-09 1.10e-08
2009 07 06 11:21 55018 40860 5.50e-09 2.59e-08
2009 07 06 11:22 55018 40920 5.68e-09 3.59e-08
2009 07 06 11:17 55018 40620 3.77e-09 9.46e-09
2009 07 06 11:18 55018 40680 3.75e-09 9.46e-09
2009 07 06 11:19 55018 40740 3.77e-09 9.47e-09

Here -k# selects the column to sort by, and you can use -r to reverse the order.

So isn't this an easier method? :) Now you can awk the value you want from output that is already in the proper order, instead of running X number of greps to get the same result :)

Best regards
Fredrik Eriksson
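
One caveat, offered as a sketch rather than a correction of the post: the plain -k8 output above is ordered as text, which is why 1.10e-08 lands before 9.46e-09. Assuming GNU sort (the `g` modifier is an extension and may be missing from other implementations), a general-numeric key orders the exponent values by magnitude, so the largest reading can be taken from the top.

```shell
data='2009 07 06 11:20 55018 40800 3.75e-09 1.10e-08
2009 07 06 11:22 55018 40920 5.68e-09 3.59e-08
2009 07 06 11:17 55018 40620 3.77e-09 9.46e-09'

# -k8,8 restricts the key to field 8; g compares it as a floating-point
# number, r puts the largest value first.
top=$(printf '%s\n' "$data" | sort -k8,8gr | head -n 1)

printf '%s\n' "$top"
```

Writing the key as `-k8,8` rather than `-k8` stops the key at the end of field 8, which matters once more columns follow it.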


© 2003 - 2011 USA LINUX USERS GROUP