|
I wrote a Python program "A" that calls a shell script "B". Script B runs another program "C" (C/C++/Java/etc.) that takes a long time, e.g. one that writes a large amount of data to stdout. Program A will sometimes terminate process B, and when that happens B should terminate its child process C. I tried to use a signal trap in script B for this, but in some scenarios the signal is not caught, e.g. TESTCASE2. My test cases follow. The tests were run on Red Hat Enterprise Linux 5.4 (Tikanga), Linux kernel 2.6.18, 32-bit, bash version 3.2.25.
(1) TESTCASE1:
(1.1) write testtrap1.sh (program "B"):
==========================
#!/bin/sh
# env
PATH=.:/bin:/usr/bin:/usr/local/bin:$HOME/bin
export PATH

#signal handle function
function my_exit( )
{
    echo "signal caught. call my_exit()"
    exit 1
}

#catch signal SIGTERM, SIGINT
trap "my_exit" 15 2
echo "trapReturnCode="$?

#program "C"
while( true ); do
    echo "..."
    sleep 1
done
==========================
(1.2) run it in an SSH console
---------------------------------
$ ./testtrap1.sh
trapReturnCode=0
...
...
...
...
---------------------------------
(1.3) send SIGTERM (kill) to it from another SSH console
---------------------------------
$ ps -ef|grep testtrap1.sh
user 15324 14147 0 15:31 pts/14 00:00:00 /bin/sh ./testtrap1.sh
user 15528 14298 0 15:32 pts/15 00:00:00 grep testtrap1.sh
$ pstree -p 15324
testtrap1.sh(15324)---sleep(15592)
$ kill 15324
$ ps -ef|grep 15324
user 15695 14298 0 15:34 pts/15 00:00:00 grep 15324
$ ps -ef|grep 15592
user 15697 14298 0 15:34 pts/15 00:00:00 grep 15592
---------------------------------
Process testtrap1.sh and its child process sleep have both terminated. In the first SSH console:
---------------------------------
...
...
...
signal caught. call my_exit()
$
---------------------------------
The trap caught the TERM signal and called the signal handler successfully.
(2) TESTCASE2:
(2.1) write testtrap2.sh (program "B"), with the cat command standing in for program "C":
==========================
#!/bin/sh
# env
PATH=.:/bin:/usr/bin:/usr/local/bin:$HOME/bin
export PATH

#signal handle function
function my_exit( )
{
    echo "signal caught. call my_exit()"
    exit 1
}

#catch signal SIGTERM, SIGINT
trap "my_exit" 15 2
echo "trapReturnCode="$?

#program "C"
cat
==========================
(2.2) run it in an SSH console
---------------------------------
$ ./testtrap2.sh
trapReturnCode=0
---------------------------------
The process blocks, waiting for input.
(2.3) send SIGTERM (kill) to it from another SSH console
---------------------------------
$ ps -ef|grep testtrap2.sh
user 15774 14147 0 15:43 pts/14 00:00:00 /bin/sh ./testtrap2.sh
user 15877 14298 0 15:47 pts/15 00:00:00 grep testtrap2.sh
$ pstree -p 15774
testtrap2.sh(15774)---cat(15775)
$ kill 15774
$ pstree -p 15774
testtrap2.sh(15774)---cat(15775)
---------------------------------
Process testtrap2.sh and its child process are still running; the script did not act on the SIGTERM signal. This is what the first SSH console shows:
---------------------------------
$ ./testtrap2.sh
trapReturnCode=0
---------------------------------
The trap failed to catch the TERM signal. Why?
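TESTCASE2 can also be reproduced from a single console. The following is only a minimal sketch under stated assumptions: `sleep 3` stands in for program "C" (instead of `cat`), the inline `bash -c` script stands in for script "B", and the variable names (`out`, `b_pid`, `trap_ran_within_1s`, `b_status`) are hypothetical. It sends SIGTERM to B while the foreground child is running and records whether the handler has produced any output one second later.

```shell
#!/bin/sh
# Hypothetical single-console reproduction of TESTCASE2.
# Assumption: "sleep 3" stands in for the long-running program "C".

out=$(mktemp)

# Script "B": install a TERM trap, then run a foreground child.
bash -c '
    trap "echo signal caught; exit 1" TERM
    sleep 3
    echo "child finished, no signal handled"
' > "$out" 2>&1 &
b_pid=$!

sleep 1                   # let B install the trap and start its child
kill -TERM "$b_pid"       # what program "A" would do
sleep 1                   # B's child is still running at this point

# Has the handler written anything one second after the signal was sent?
if grep -q "signal caught" "$out"; then
    trap_ran_within_1s=yes
else
    trap_ran_within_1s=no
fi

wait "$b_pid"             # returns once B itself has exited
b_status=$?

echo "trap_ran_within_1s=$trap_ran_within_1s exit_status=$b_status"
echo "output of B:"
cat "$out"
```

Comparing `trap_ran_within_1s` with the final contents of the output file shows whether the handler ran while the child was still in the foreground or only after the child had finished.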
(3) TESTCASE3 (continuing from TESTCASE2):
(3.1) press CTRL+C to send SIGINT in the first SSH console:
---------------------------------
$ ./testtrap2.sh
trapReturnCode=0
signal caught. call my_exit()
$
---------------------------------
The trap catches SIGINT and calls the signal handler normally when the signal is sent from the controlling console of the session.
(4) TESTCASE4:
(4.1) rerun testtrap2.sh in an SSH console:
---------------------------------
$ ./testtrap2.sh
trapReturnCode=0
---------------------------------
The process blocks, waiting for input.
(4.2) send SIGTERM to the child process from another SSH console:
---------------------------------
$ ps -ef|grep testtrap2.sh
user 16017 14147 0 16:08 pts/14 00:00:00 /bin/sh ./testtrap2.sh
user 16052 14298 0 16:11 pts/15 00:00:00 grep testtrap2.sh
$ pstree -p 16017
testtrap2.sh(16017)---cat(16018)
$ kill 16018
$ pstree -p 16017
$ ps -ef|grep cat
user 16072 14298 0 16:15 pts/15 00:00:00 grep cat
---------------------------------
The child process "cat" and its parent process "testtrap2.sh" are both terminated, as expected.
But sometimes the child process "cat" (or the parent process "testtrap2.sh"?) appears to catch the SIGTERM sent to the child, and the signal handler is called in the first SSH console:
---------------------------------
$ ./testtrap2.sh
trapReturnCode=0
Terminated
signal caught. call my_exit()
$
---------------------------------
And sometimes it does not:
---------------------------------
$ ./testtrap2.sh
trapReturnCode=0
Terminated
$
---------------------------------
This is what puzzles me. I guess bash uses the fork()/exec() model to run program "C" (cat). After exec() in the child process, the signal disposition should be reset to SIG_DFL or SIG_IGN rather than remain the handler function of the parent process. Can anyone tell me why this happens only some of the time?
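The fork()/exec() guess can be checked on its own with a small experiment. This is a sketch under stated assumptions: `sleep` stands in for program "C", and the variable names `status_caught` and `status_ignored` are hypothetical. It relies on the POSIX rule that a caught signal is reset to its default action across exec(), while an ignored signal stays ignored.

```shell
#!/bin/sh
# Hypothetical check of what a child inherits after fork() + exec().
# Assumption: "sleep" stands in for program "C".

# Case 1: the parent shell CATCHES TERM with a handler.
# The child should have the default action for TERM, so killing it
# terminates it and wait reports 128 + 15.
status_caught=$(bash -c '
    trap "echo handler ran in parent" TERM
    sleep 30 &
    sleep 1               # let the child finish fork() and exec()
    kill -TERM $!
    wait $!
    echo $?
' 2>/dev/null)

# Case 2: the parent shell IGNORES TERM (empty trap action).
# The child inherits the ignored disposition, survives the kill,
# and exits normally when its 2 seconds are up.
status_ignored=$(bash -c '
    trap "" TERM
    sleep 2 &
    sleep 1
    kill -TERM $!
    wait $!
    echo $?
' 2>/dev/null)

echo "child status when parent catches TERM: $status_caught"
echo "child status when parent ignores TERM: $status_ignored"
```

In the first case the handler text never appears in the captured output, because the signal is sent only to the child and the child does not carry the parent's handler.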
(5) TESTCASE5:
(5.1) rerun testtrap2.sh in an SSH console:
---------------------------------
$ ./testtrap2.sh
trapReturnCode=0
---------------------------------
The process blocks, waiting for input.
(5.2) send SIGKILL to the parent process from another SSH console:
---------------------------------
$ ps -ef|grep testtrap2.sh
user 16174 14147 0 16:32 pts/14 00:00:00 /bin/sh ./testtrap2.sh
user 16179 14298 0 16:32 pts/15 00:00:00 grep testtrap2.sh
$ pstree -p 16174
testtrap2.sh(16174)---cat(16175)
$ ps -ef|grep 16174
user 16184 14298 0 16:33 pts/15 00:00:00 grep 16174
$ ps -ef|grep 16175
user 16175 1 0 16:32 pts/14 00:00:00 cat
user 16186 14298 0 16:33 pts/15 00:00:00 grep 16175
---------------------------------
The parent process "testtrap2.sh" was killed and its child process "cat" was adopted by the init daemon (PPID 1), as expected.
This is what I captured in the first SSH console:
---------------------------------
$ ./testtrap2.sh
trapReturnCode=0
Killed
$
---------------------------------
This is driving me crazy. Could someone please help me figure out why the trap failed to catch the signal in TESTCASE2, and explain the behavior in TESTCASE4? Thanks in advance.
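For comparison, here is a minimal sketch of a variant of script "B" that starts program "C" in the background and blocks in the `wait` builtin instead of running C in the foreground. The bash manual documents that a trapped signal makes `wait` return immediately, after which the trap runs. The assumptions are that `sleep 30` stands in for C, that only TERM is trapped, and that `child_pid`, `out`, `b_pid`, `trap_ran_within_1s` and `b_status` are hypothetical names. It is one possible arrangement, not necessarily the right one for every program C.

```shell
#!/bin/sh
# Hypothetical variant of script "B": run "C" in the background and wait on it.
# Assumption: "sleep 30" stands in for the long-running program "C".

out=$(mktemp)

bash -c '
    my_exit() {
        echo "signal caught. call my_exit()"
        kill -TERM "$child_pid" 2>/dev/null   # pass the signal on to C
        exit 1
    }
    trap my_exit TERM

    sleep 30 &            # program "C", started in the background
    child_pid=$!
    wait "$child_pid"     # a trapped signal interrupts the wait builtin
' > "$out" 2>&1 &
b_pid=$!

sleep 1                   # let B install the trap and start C
kill -TERM "$b_pid"       # what program "A" would do
sleep 1

# Has the handler written anything one second after the signal was sent?
if grep -q "signal caught" "$out"; then
    trap_ran_within_1s=yes
else
    trap_ran_within_1s=no
fi

wait "$b_pid"
b_status=$?
echo "trap_ran_within_1s=$trap_ran_within_1s exit_status=$b_status"
```

Because the handler knows the PID of C, it can forward the signal before exiting, which is the cleanup that program A expects from B.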
|