gawk command not found csh script loop iteration ubuntu desktop x64 10.04

Asked by José de la Bastida

I am using Ubuntu Desktop x64 10.04 on an HP ProLiant ML350 G6 server.
I am executing a simple csh script which uses gawk to process some strings.
Inside the script there is a loop which calls gawk repeatedly.
After some successful executions I receive a "GAWK: COMMAND NOT FOUND" error message and the script crashes.

I don't understand why everything goes fine at the beginning of the execution, but after some iterations the gawk command "disappears".

The following csh script illustrates my problem:

#! /bin/csh -f
echo $0 ">>" `date`
echo $0 ">>" `date` > reporte
set c = 1
set cc = 0
while ($c < 5)
    @ cc++
    set a = `echo hola | gawk '{print $1}'`
    if ($a == "") then
        echo error $c paso $cc
        echo error $c paso $cc >> reporte
        @ c++
    endif
end
echo $0 ">>" `date`
echo $0 ">>" `date` >> reporte

Here is the output of one execution:

ubuntu-test.csh >> Wed Jan 19 11:19:17 ECT 2011
error 1 paso 23429
error 2 paso 23913
error 3 paso 26982
error 4 paso 29193
ubuntu-test.csh >> Wed Jan 19 11:20:07 ECT 2011

Basically, this script counts the number of iterations needed to produce the "ERROR". It's interesting to notice that "GAWK" disappears for a while, but re-appears after some other iterations.

It's a real problem for me because I use gawk in many scripts for many tasks. I have used SUSE and CentOS distributions before with the same script and never had this problem. I really want to use Ubuntu because it has a lot of features I need.

I hope you can help me.
Thanks in advance.

Att, Jose.

Question information

Language: English
Status: Solved
For: Ubuntu gawk
Assignee: No assignee
Solved by: José de la Bastida

mycae (mycae) said :
#1

I think you need more info. I would run strace on the script, then terminate it inside your if statement. Working backwards from that point, you should be able to grep the strace output for the gawk command search and see what is going on.
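
As a rough sketch of that idea, something like the following could be tried. It assumes the test script is saved as ./ubuntu-test.csh and that strace is installed; the log file name csh-trace.log is just a placeholder chosen for this example.

```shell
# Assumptions: the script name and the log file name are placeholders.
script=./ubuntu-test.csh
log=csh-trace.log

if command -v strace >/dev/null 2>&1 && [ -f "$script" ]; then
    # -f follows child processes, -e trace=execve records only exec calls,
    # -o writes the trace to a file instead of stderr.
    strace -f -e trace=execve -o "$log" csh -f "$script"
    # Show the most recent exec attempts that mention gawk.
    grep 'execve.*gawk' "$log" | tail -n 20 || true
else
    echo "strace or $script not available; nothing traced"
fi
```

Adding an exit inside the if block of the csh script would stop the run at the first failure, so the interesting lines end up at the tail of the log.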

mycae (mycae) said :
#2

Also, you might want to see whether converting the test to bash, rather than csh, changes the results.
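
A possible bash version of the test loop is sketched below. It is not the original script: the 1000-iteration cap is an assumption added so the loop ends even when no failure occurs (the csh run needed over 20,000 iterations to hit the first error, so the cap may need raising), and it falls back to plain awk if gawk is not installed.

```shell
#!/bin/bash
# Hypothetical bash port of the csh test loop.
# Assumption: max_iterations is an arbitrary cap, not part of the original.
awk_cmd=$(command -v gawk || command -v awk)   # prefer gawk, else awk
max_iterations=1000
iterations=0
errors=0

while [ "$iterations" -lt "$max_iterations" ]; do
    iterations=$((iterations + 1))
    a=$(echo hola | "$awk_cmd" '{print $1}')
    # An empty result means the command substitution produced no output.
    if [ -z "$a" ]; then
        errors=$((errors + 1))
        echo "error $errors paso $iterations"
    fi
done

echo "iterations=$iterations errors=$errors"
```

If the bash loop never reports an error while the csh one does, that points at the shell rather than at gawk.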

Ralph Corderoy (ralph-inputplus) said :
#3

I have managed to re-create your problem on, like you, 10.04. I suspect it's more of a csh problem than gawk. I'm still investigating but it would be interesting to know if the problem remains for you if you change gawk to mawk. Another test would be to install package tcsh and alter your test script to be /bin/tcsh. If the fault doesn't occur with either of these changes then it suggests csh from package csh is the problem.

José de la Bastida (jdelabastida) said :
#4

Thank you very much everyone :-) ... I will try what you have suggested right now and tell you what happens.

Ralph Corderoy (ralph-inputplus) said :
#5

I've been able to show that gawk can be replaced with another command, e.g. sed(1), and the problem continues. Also, using inotifywait(1), I found that the executable used, e.g. gawk or sed, isn't being open(2)ed by csh on the occasions when the backticks fail to work. Please bump this onto being a csh question rather than a gawk one if possible. I suspect csh's use of vfork(2) is the problem; it does too much compared to what vfork's Ubuntu man page suggests.

José de la Bastida (jdelabastida) said :
#6

I have good news. I tested my script using tcsh and it ran without errors (>tcsh myscript). So I decided to uninstall the csh package, keeping only the tcsh package on my Ubuntu system. I tested my script again using (>csh myscript) and it ran without errors again.

Apparently this error occurred because I had these two packages (csh and tcsh) installed at the same time.

I am not sure why this occurred, but my problem has been solved. Thanks everyone for your support.

Ralph Corderoy (ralph-inputplus) said :
#7

It wasn't having both the csh and tcsh packages installed that caused the problem. When you have both installed, /bin/csh is a symbolic link to one or the other and can be updated with `update-alternatives --config csh'. The default is bsd-csh from the csh package. When you removed the csh package, the /bin/csh link was made to point to the shell from the tcsh package instead.
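
To check which shell /bin/csh currently resolves to, a small sketch along these lines may help. The helper name resolve_link is made up for this example; readlink -f and update-alternatives --display are the standard tools.

```shell
# Print the real file behind a path, following every symlink in the chain.
# (resolve_link is a hypothetical helper name used only in this example.)
resolve_link() {
    readlink -f "$1"
}

if [ -e /bin/csh ]; then
    echo "/bin/csh resolves to: $(resolve_link /bin/csh)"
else
    echo "/bin/csh is not installed"
fi

# update-alternatives is Debian/Ubuntu specific, so only call it when present.
if command -v update-alternatives >/dev/null 2>&1; then
    update-alternatives --display csh 2>/dev/null || echo "no csh alternatives registered"
fi
```

On a system with both packages installed, the resolved path should show whether bsd-csh or tcsh is the one actually being run.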

It appears to be a bug in bsd-csh from the csh package. In texec() the call to execve(2) returns ENOENT. Printing the parameters passed to execve(2) immediately beforehand looks fine. I haven't time to investigate further at the moment.

mycae (mycae) said :
#8

The only thing I can think of is that there is a file descriptor leak. Perhaps lsof will show this?
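
One way to look for such a leak, sketched here under the assumption of a Linux system with /proc mounted, is to count the descriptors a process holds and watch whether the number grows. The function name count_fds is invented for this example.

```shell
# Count the file descriptors a process currently holds open (Linux /proc only).
# (count_fds is a hypothetical helper name used only in this example.)
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Demonstrated on the current shell; for the real test, pass the PID of the
# running csh instead of $$.
echo "open descriptors in this shell: $(count_fds $$)"
```

Running count_fds against the csh PID every few seconds would show a steady climb if descriptors were leaking; lsof -p with the same PID lists what each descriptor refers to.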

Ralph Corderoy (ralph-inputplus) said :
#9

csh closes the first 1,024 file descriptors, whether they're open or not, soon after starting, and I don't see it doing anything after that which would suggest the number in use is rising enough to exhaust them. I think it needs more detail from the kernel as to why execve is failing on the odd occasion. If you strace or ltrace csh then the problem goes away.