
Pick out fields of input text.


All times are UTC - 6 hours


Posted: Sun Oct 21, 2012 4:20 pm

Joined: Sun Oct 21, 2012 3:23 pm
Posts: 7
The command is called 'field'. Below is the output of its help
message, followed by the code.

Quote:
$ field --help

Usage: field [-d field-separator] [[-]column-number[+]|-fieldname|text]...

Print given field column(s) from input. Default is last field, -1.
Use a minus number to print counting from the last field.
First field is numbered from 1 not from zero.
Use -<fieldname> to print the value of a named field.
Ordinary text is just copied to output.
Use a plus sign, '+', after a field number to print all following fields.

eg, $ echo yes we can | field 2 3 3 1
we can can yes
$ echo b="item1" bcost="99" | field 'It costs: ' -bcost=
It costs: 99
$ cat /etc/passwd | field -d: userid: 1 shell: -1
userid: guest shell: /bin/sh
$ find . -printf %T+' ' -print |sort|field 2+
--List of files sorted by modified time, oldest first


Code:
#!/bin/bash
#
main() {
    # Show usage when stdin is a terminal or the first argument is -help/--help.
    if [[ -t 0 ]] || regexp '^-?-help$' "$1"; then echo -e $msg;exit 1;fi

    isNaN() {
   if test -z "$*";then return 1;fi
   if regexp "^-?[0-9]+\+?$" "$*";then return 1;else return 0;fi
    }
    #echo main debug col:$col. fsep:$fsep.  arg1:$1.
    local fsep=' '
    if test "$1" = -d;then
   fsep=$2;shift 2;
    else
   if expr match "$1" '-d.*' >/dev/null; then
       fsep=${1#-d};shift;fi
    fi
    fs=$fsep
    while isNaN $1;do if test ${1:0:1} = -; then break;fi;
   prefix=$prefix"$1 "; shift;
    done
    while read input; do
   #echo -n debug read input as: $input'  '
   if test "$prefix";then echo -n "$prefix";fi
   if test -z $1;then field;fi; #default.
   i=0;
   for col in "$@";do
       fs=$fsep
       if isNaN $col;then
      if expr match "$col" - >/dev/null; then
          fs=${col:1}; col=2;flag=\"\|\'; #is named field
      else echo -n  "$col ";continue; #is just text
      fi
       else #is numeric col(s)
      if [[ $col = *+ ]];then andon=true;col=${col/+};fi
      if test $col -lt 0;then if test $col -eq -1;then
         col='';
          else col=$col+1;fi; col=NF$col;fi
       fi
       field $flag;
       flag=''
       let i=i+1
       if test $# -ne $i;then echo -n ' ';fi 2>/dev/null;
   done #end of: for col in $@
   echo 2>/dev/null # in case of broken pipe.
    done #end of: while read input
}

field() {
    #echo -n field debug fs:$fs.  col:$col.  arg1:$1. input:$input.'  '
    if test -z $col;then col=NF;fi
    echo   "$input"|
    if test -z $andon; then
   if test -z $fs;then
       awk '{printf "%s", $('$col')}'
   else
       if test -z $1; then
      awk -F"$fs" '{ printf "%s", $('$col') }'
          else
      after=`awk -F"$fs" '{ printf "%s", $2 }'`  # >1 field match in? print $3, $4?
      if `regexp "^\ *[\"\']" "$after"`;then
          echo -n "$after" | awk -F"$1" '{ printf "%s", $2 }'
      else
          after=${after## }
          echo -n ${after%% *}
      fi
       fi
   fi
    else #$andon is not -z
   tr '\t' ' '|
   cut -d"$fs" -f $col-|
   tr -d "\n"
#     awk -F"$fs" '{ for ( i = '$col' ; i <= NF; i++ ) printf $i" " }'
#   cut -d"'"$fs"'" -f$col-;  ## this doesn't strip spaces from rest of input.
    fi
}

regexp() {
  rexp=$1;shift
  echo "$*"|egrep -q -e "$rexp"
}

msg="\nUsage: field [-d field-separator] [[-]column-number[+]|-fieldname|text]...
\n
\nPrint given field column(s) from input.  Default is last field, -1.  \
\nUse a minus number to print counting from the last field.  \
\nFirst field is numbered from 1 not from zero.\
\nUse -<fieldname> to print the value of a named field.
\nOrdinary text is just copied to output.
\nUse a plus sign, '+', after a field number to print all following fields.
\n\neg,\t\$ echo yes we can | field 2 3 3 1
\n\twe can can yes
\n\t\$ echo b=\"item1\" bcost=\"99\"   | field 'It costs: ' -bcost=
\n\tIt costs: 99
\n\t\$ cat /etc/passwd | field -d: userid: 1  shell: -1\
\n\tuserid: guest shell:  /bin/sh
\n\t$ find . -printf %T+' '  -print |sort|field 2+
\n\t--List of files sorted by modified time, oldest first.
\n"

main $* 2>/dev/null




Last edited by jay on Sun Dec 08, 2013 11:01 am, edited 4 times in total.

Posted: Mon Oct 22, 2012 1:33 am

Joined: Mon Mar 02, 2009 3:03 am
Posts: 574
hi,

that's nice.

but:
why does bash have to be in posix mode?
main is useless: functions are mostly used for repeated code.
instead of testing the whole $* variable, just test $1, or check that $# is 1 or more.
variables in tests should always be quoted.
expr is not bash; you can probably do the same thing using double square brackets and BASH_REMATCH.
a regex for [[ is easier to use when it is stored in a variable:
Code:
reg="^ *[\"']"
[[ $var =~ $reg ]] ...
you see that the lhs var doesn't need to be quoted.
instead of using awk, you could read each line into an array, split on a defined IFS, and print its fields in any order.
use more quotes
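
A rough sketch of the array idea above, assuming bash. The name `pick_fields` and its argument order are made up for illustration, and it only covers plain numeric columns, not the named fields or the `+` suffix of the posted script:

```bash
#!/bin/bash
# pick_fields SEP N... (hypothetical helper, not part of the posted script)
# Print the requested 1-based fields of every line on stdin.
# Negative numbers count from the end, so -1 is the last field.
pick_fields() {
    local sep=$1; shift
    local -a parts out
    local line n idx
    while IFS= read -r line; do
        # Split the line on $sep into the array "parts".
        # IFS is changed for this one read only.
        IFS=$sep read -r -a parts <<< "$line"
        out=()
        for n in "$@"; do
            if (( n < 0 )); then
                idx=$(( ${#parts[@]} + n ))
            else
                idx=$(( n - 1 ))
            fi
            # Out-of-range columns print as empty strings.
            if (( idx >= 0 && idx < ${#parts[@]} )); then
                out+=("${parts[idx]}")
            else
                out+=("")
            fi
        done
        # "${out[*]}" joins the selected fields with a single space.
        printf '%s\n' "${out[*]}"
    done
}
```

Something like `echo yes we can | pick_fields ' ' 2 3 3 1` then needs no awk or cut at all, since the splitting is done by `read -a` itself.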


Posted: Mon Oct 22, 2012 7:56 am

Joined: Sun Oct 21, 2012 3:23 pm
Posts: 7
Thanks for your useful comments.

Posix mode is given because the script still runs in bash; it just means that posix shells may also run it. I can't find a general shell script repository on the internet.

Calling main() at the end is useful in that it allows the use of low level functions in the main body. You cannot pre-declare functions in bash.

Yes, using $# may be quicker, but how much is a microsecond worth these days?
Quotes are not needed for "test -z", nor if you know the variable is non-empty.
As to regexp, your way is clearer.
I'm not sure how to use REMATCH but I've not noticed an absence of expr anywhere.

Thanks, J.


Posted: Mon Oct 22, 2012 3:38 pm

Joined: Mon Mar 02, 2009 3:03 am
Posts: 574
[[ is not posix, so your script won't work with strictly posix shells.
--posix doesn't exactly mimic a posix shell
Quote:
Change the behavior of bash where the default operation differs from the POSIX standard to match the standard (posix mode).
you can't rely on this option to test whether your code is strictly posix; it is better to use (d)ash.

what do you mean «you can't pre-declare functions in bash» ?
functions can be declared at the top of the script, just like variables.
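
A small sketch of the ordering rule, with made-up function names: bash looks a function up when it is called, not when the caller is defined, so the only requirement is that every function is defined before the first call that reaches it.

```bash
#!/bin/bash
# main is defined first, yet it calls helper, which is defined further down.
# This works because the name "helper" is only resolved when main runs.
main() {
    helper "$@"
}

helper() {
    printf 'helper got: %s\n' "$*"
}

# Both functions exist by this point, so the call succeeds.
main "$@"
```

Moving the `main "$@"` line above the definition of `helper` would fail with "command not found", which is the case that calling main at the end of the script avoids.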

Code:
$ var="foo bar"
$ test -z $var
bash: test: foo: binary operator expected
$ test -z "$var"
$
see!

`expr` is not a shell builtin, so
why use an external command when the shell can do it by itself?
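
For instance, the two pattern checks could be done inside bash roughly like this. It is only a sketch: the function names `parse_sep` and `is_column` are invented here and are not taken from the posted script.

```bash
#!/bin/bash
# parse_sep ARG (hypothetical): if ARG has the form -dSEP, print SEP and
# return 0; otherwise print nothing and return 1.
parse_sep() {
    local re='^-d(.+)$'
    if [[ $1 =~ $re ]]; then
        # BASH_REMATCH[1] holds the text matched by the first (...) group.
        printf '%s\n' "${BASH_REMATCH[1]}"
        return 0
    fi
    return 1
}

# is_column ARG (hypothetical): succeed if ARG is a column number such as
# 2, -1 or 3+ (optional leading minus, digits, optional trailing plus).
is_column() {
    local re='^-?[0-9]+\+?$'
    [[ $1 =~ $re ]]
}
```

Keeping each regex in a variable avoids the quoting surprises of writing it inline on the right-hand side of `=~`, and no external `expr` or `egrep` process is started.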


Posted: Tue Oct 23, 2012 12:47 pm

Joined: Sun Oct 21, 2012 3:23 pm
Posts: 7
Point taken, thanks.

Predeclare is from C programming.


Posted: Sun Mar 10, 2013 2:33 pm

Joined: Sun Oct 21, 2012 3:23 pm
Posts: 7
Point taken, thanks.

Predeclare is from C programming.

