
Script to pull a specific set of words out of a text file


All times are UTC - 6 hours


Posted: Sat Jul 27, 2013 7:29 am

Joined: Fri Jul 19, 2013 5:17 pm
Posts: 11
I'm having trouble copying the first word from a specific set of lines in one text file into a second text file.
I've looked at sed and awk but haven't found a solution. Any ideas?


Posted: Sat Jul 27, 2013 10:43 am

Joined: Mon Mar 02, 2009 3:03 am
Posts: 579
hi,

what's the input?
what's the desired output?
what have you really tried so far?


Posted: Tue Jul 30, 2013 6:09 am

Joined: Mon Mar 02, 2009 3:03 am
Posts: 579
LHolcomb posted:
Quote:
The input is the output file of a lynx dump that has been pared down to show the top 25 stock movers for a given trading day on the NASDAQ stock exchange. What I need is just the stock symbols, which are the first three, sometimes four, sometimes five characters of each entry, so the length varies. Here's some sample output:

INPH Interphase Corporation 4.60 Up 1.57 176,052 Chart, Profile
Jul 26 (51.82%) , More
As you can see, I get two lines of output per stock. All I need from this output is the stock symbol, INPH; the rest is of no interest to me. Also, the output needs to be redirected to a 3rd text file for further

Here's the initial script:
Code:
#!/bin/bash

#movers
#EXTRA OPTIONS
uagent="firefox/22.0" #user agent (fake a browser)
sleeptime=01 #add pause between requests
{
w3m -dump "http://finance.yahoo.com/gainers?e=o" -T text/html >>'/home/user/Desktop/movers.txt'

#INITIAL PAGE
echo "[+] Fetching" && sleep $sleeptime
initpage=`curl -s -b "cookie.txt" -c "cookie.txt" -L --sslv3 -A "$uagent" "http://finance.yahoo.com/gainers?e=o"`
token=`echo "$initpage" | grep "authenticity_token" | sed -e 's/.*value="//' | sed -e 's/" \/>.*//'`

/bin/bash /home/user/Desktop/moverparsingscript
}
rm "cookie.txt"


Here is the sed script that pares down the output:
Code:
#!/bin/bash

sed -e '1,132d' /home/user/Desktop/movers.txt > /home/user/Desktop/parse1.txt

sed -e '50,211d' /home/user/Desktop/parse1.txt > /home/user/Desktop/parse2.txt



rm /home/user/Desktop/parse1.txt

rm /home/user/Desktop/movers.txt

This works pretty well and cleans up the extra files after it finishes.

I've been playing with this piece of code to see if I can get it to parse the file for me, but I'm grasping in the dark at this point:

Code:
${VAR ##pattern }'/home/user/Desktop/parse2.txt' > '/home/user/Desktop/parse3.txt'


Any help would be greatly appreciated.

Lonnie


Posted: Tue Jul 30, 2013 6:12 am

Joined: Mon Mar 02, 2009 3:03 am
Posts: 579
I'm not sure I understand everything:
Code:
w3m -dump "http://finance.yahoo.com/gainers?e=o" -T text/html 2>/dev/null |sed -n '/^Symbol/,/^Get a Free/{/^\(Symbol\|Get a Free\)/d;1~2d;s/\([^[:blank:]]*\)[[:blank:]].*/\1/p}'
RELV
SNTA
SPEX
INPH
CZFC
HDSN
NVEEU
PRAN
CREG
PBIB
SYNM
WAVX
LPTN
MBLX
PEIX
MWIV
APPY
ARWR
OMED
NWBO
OCZ
GCBC
KONE
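
If the already pared-down file (parse2.txt in the question) is the starting point, a pattern-based awk filter may be easier to maintain than selecting lines by position. This is only a minimal sketch: it assumes every symbol row begins with an all-uppercase word of at most five letters and that the continuation rows (like "Jul 26 (51.82%) , More") do not. The function name extract_symbols is made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: print the first field of each line whose first
# field is 1-5 uppercase ASCII letters (the assumed shape of a symbol).
# LC_ALL=C keeps the [A-Z] range from matching lowercase in some locales.
extract_symbols() {
    LC_ALL=C awk '$1 ~ /^[A-Z]+$/ && length($1) <= 5 { print $1 }' "$@"
}

# Demo using the two sample lines quoted in the question.
printf '%s\n' \
    'INPH Interphase Corporation 4.60 Up 1.57 176,052 Chart, Profile' \
    'Jul 26 (51.82%) , More' | extract_symbols
# prints: INPH
```

To write the symbols to a third file, something like `extract_symbols /home/user/Desktop/parse2.txt > /home/user/Desktop/parse3.txt` should do it. Any other line that happens to start with a short all-caps word would also be printed, so the pattern may need tightening for the real page.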

