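Code:
Copied from: http://www.fanqiang.com/a4/b2/20021101/060200330.html
In a fork()/execve() workflow, suppose the parent is still alive when the child exits. If the parent neither installed a SIGCHLD handler that calls waitpid() before fork(), nor explicitly ignored that signal (and never calls wait()/waitpid() some other way), the exited child lingers as a zombie process. A zombie has already terminated and only its process-table entry remains, so kill -9 has no effect on it, even when run as root. The remedy is to kill the zombie's parent (a zombie's parent necessarily still exists): the zombie then becomes an orphan, is adopted by init (PID 1), and init always reaps it.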
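To make the quoted explanation concrete, here is a minimal Python sketch of the "install a SIGCHLD handler that calls waitpid()" approach. The names `reap_children`, `spawn_and_reap`, and `reaped` are made up for this illustration, and it only shows the general technique, not how Apache or any CGI script actually does it.

```python
import os
import signal
import time

reaped = []  # (pid, status) pairs collected by the handler (illustrative name)

def reap_children(signum, frame):
    """SIGCHLD handler: collect every child that has already exited."""
    while True:
        try:
            # -1 = any child; WNOHANG = return immediately instead of blocking
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # this process has no children left
        if pid == 0:
            return  # children exist, but none has exited yet
        reaped.append((pid, status))

def spawn_and_reap(n):
    """Fork n short-lived children and wait until the handler has reaped them."""
    reaped.clear()
    old_handler = signal.signal(signal.SIGCHLD, reap_children)
    try:
        pids = []
        for _ in range(n):
            pid = os.fork()
            if pid == 0:
                os._exit(0)  # child: exit at once, like a short-lived helper
            pids.append(pid)
        deadline = time.monotonic() + 5
        while len(reaped) < n and time.monotonic() < deadline:
            time.sleep(0.01)
        return pids
    finally:
        signal.signal(signal.SIGCHLD, old_handler)
```

The handler loops because several children may exit before a single SIGCHLD is handled. Without the handler (and without any other wait()/waitpid() call), each of those children would show up with state Z until the parent exits. Setting SIGCHLD to SIG_IGN is the "explicitly ignore the signal" alternative mentioned above.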
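So when zombie processes show up, what should we do to fix them, and how do we track down the root cause? Is it an Apache misconfiguration (zombies appear less than a minute after I start Apache), or a problem in how the application itself is written?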
Code:
last pid: 67388;  load averages:  0.17,  0.50,  0.49    up 6+22:42:11  17:26:39
46 processes:  1 running, 34 sleeping, 11 zombie
CPU states:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
Mem: 139M Active, 245M Inact, 90M Wired, 20M Cache, 60M Buf, 6136K Free
Swap: 512M Total, 244K Used, 512M Free

  PID USERNAME PRI NICE    SIZE     RES STATE     TIME   WCPU    CPU COMMAND
 1177 mysql      2    0    106M  54568K poll    549:44  4.30%  4.30% mysqld
66281 nobody     2    0  10592K   9640K sbwait    0:06  0.78%  0.78% httpd
66286 nobody     2    0   9444K   8576K sbwait    0:04  0.39%  0.39% httpd
64692 root      10    0   8416K   7812K nanslp   12:20  0.00%  0.00% perl
  100 root       2    0    944K    480K select    0:47  0.00%  0.00% syslogd
66283 nobody     2    0   8036K   7180K accept    0:07  0.00%  0.00% httpd
66282 nobody     2    0   9536K   8576K accept    0:07  0.00%  0.00% httpd
66284 nobody     2    0   9600K   8640K accept    0:06  0.00%  0.00% httpd
66287 nobody     2    0   9472K   8512K accept    0:06  0.00%  0.00% httpd
66291 nobody     2    0   9312K   8432K accept    0:06  0.00%  0.00% httpd
66292 nobody     2    0   9364K   8408K sbwait    0:05  0.00%  0.00% httpd
66285 nobody     2    0  10256K   9380K accept    0:05  0.00%  0.00% httpd
66290 nobody     2    0   9672K   8724K accept    0:05  0.00%  0.00% httpd
66288 nobody     2    0   9132K   8264K accept    0:05  0.00%  0.00% httpd
47801 root       2    0   4348K   3040K select    0:04  0.00%  0.00% httpd
  109 root       2    0   2320K   1072K select    0:03  0.00%  0.00% sshd
  137 root       2    0   1748K    920K select    0:02  0.00%  0.00% proftpd
  107 root      10    0   1012K    516K nanslp    0:02  0.00%  0.00% cron
67378 nobody     2    0   5868K   4932K accept    0:00  0.00%  0.00% httpd
67388 root      28    0   1968K   1060K RUN       0:00  0.00%  0.00% top
67384 root       2    0   2448K   1648K select    0:00  0.00%  0.00% sshd
67383 nobody     2    0   2756K   1912K connec    0:00  0.00%  0.00% sendmail
Code:
ps -aux
USER     PID %CPU %MEM    VSZ   RSS TT  STAT STARTED      TIME COMMAND
mysql   1177  9.7 10.5 108128 54568 p0- S    Mon10PM 549:46.96 /usr/local/mysql-standard-4.0.12-unknown-freebs
nobody 66517  0.0  0.0      0     0 ??  Z     5:05PM   0:00.00 (sh)
nobody 66525  0.0  0.0      0     0 ??  Z     5:07PM   0:00.00 (sh)
nobody 66673  0.0  0.0      0     0 ??  Z     5:08PM   0:00.00 (sh)
nobody 66641  0.0  0.0      0     0 ??  Z     5:08PM   0:00.00 (sh)
nobody 66733  0.0  0.0      0     0 ??  Z     5:10PM   0:00.00 (sh)
nobody 66740  0.0  0.0      0     0 ??  Z     5:11PM   0:00.00 (sh)
nobody 66838  0.0  0.0      0     0 ??  Z     5:13PM   0:00.00 (sh)
nobody 66927  0.0  0.0      0     0 ??  Z     5:14PM   0:00.00 (sh)
nobody 66954  0.0  0.0      0     0 ??  Z     5:14PM   0:00.00 (sh)
nobody 67110  0.0  0.0      0     0 ??  Z     5:19PM   0:00.00 (sh)
root       0  0.0  0.0      0     0 ??  DLs  Mon06PM   0:00.00 (swapper)
root       1  0.0  0.0    552   204 ??  ILs  Mon06PM   0:00.03 /sbin/init --
root       2  0.0  0.0      0     0 ??  DL   Mon06PM   1:02.04 (pagedaemon)
root       3  0.0  0.0      0     0 ??  DL   Mon06PM   0:00.00 (vmdaemon)
root       4  0.0  0.0      0     0 ??  DL   Mon06PM   0:06.69 (bufdaemon)
root       5  0.0  0.0      0     0 ??  DL   Mon06PM   7:42.06 (syncer)
root       6  0.0  0.0      0     0 ??  DL   Mon06PM   0:08.19 (vnlru)
root     100  0.0  0.1    944   480 ??  Ss   Mon06PM   0:46.73 /usr/sbin/syslogd -s
root     107  0.0  0.1   1012   516 ??  Is   Mon06PM   0:01.82 /usr/sbin/cron
root     109  0.0  0.2   2320  1072 ??  Is   Mon06PM   0:02.56 /usr/sbin/sshd
nobody   137  0.0  0.2   1748   920 ??  Is   Mon06PM   0:02.40 proftpd: (accepting connections) (proftpd)
root     191  0.0  0.1    948   460 v0  Is+  Mon06PM   0:00.00 /usr/libexec/getty Pc ttyv0
root     192  0.0  0.1    948   460 v1  Is+  Mon06PM   0:00.00 /usr/libexec/getty Pc ttyv1
root     193  0.0  0.1    948   460 v2  Is+  Mon06PM   0:00.00 /usr/libexec/getty Pc ttyv2
root     194  0.0  0.1    948   460 v3  Is+  Mon06PM   0:00.00 /usr/libexec/getty Pc ttyv3
root     195  0.0  0.1    948   460 v4  Is+  Mon06PM   0:00.00 /usr/libexec/getty Pc ttyv4
root     196  0.0  0.1    948   460 v5  Is+  Mon06PM   0:00.00 /usr/libexec/getty Pc ttyv5
root     197  0.0  0.1    948   460 v6  Is+  Mon06PM   0:00.00 /usr/libexec/getty Pc ttyv6
root     198  0.0  0.1    948   460 v7  Is+  Mon06PM   0:00.00 /usr/libexec/getty Pc ttyv7
root    1157  0.0  0.2   1276   848 p0- I    Mon10PM   0:00.00 _su (csh)
root    1158  0.0  0.1    648   264 p0- I    Mon10PM   0:00.01 /bin/sh bin/safe_mysqld --user=mysql
root   47801  0.0  0.6   4348  3040 ??  Ss   10:29AM   0:03.59 /usr/local/www1/apache/bin/httpd
root   64692  0.0  1.5   8416  7812 ??  Is    4:27PM  12:20.10 /usr/bin/perl -w /usr/local/bin/mrtg mrtgok.cfg
nobody 66281  0.0  1.9  10592  9640 ??  S     5:00PM   0:06.28 /usr/local/www1/apache/bin/httpd
nobody 66282  0.0  1.7   9536  8576 ??  S     5:00PM   0:06.61 /usr/local/www1/apache/bin/httpd
nobody 66283  0.0  1.4   8036  7180 ??  S     5:00PM   0:06.71 /usr/local/www1/apache/bin/httpd
nobody 66284  0.3  1.7   9600  8640 ??  S     5:00PM   0:06.66 /usr/local/www1/apache/bin/httpd
nobody 66285  0.0  1.8  10256  9380 ??  S     5:00PM   0:05.44 /usr/local/www1/apache/bin/httpd
nobody 66286  0.0  1.7   9444  8576 ??  S     5:00PM   0:04.42 /usr/local/www1/apache/bin/httpd
nobody 66287  0.0  1.6   9472  8512 ??  S     5:00PM   0:06.58 /usr/local/www1/apache/bin/httpd
nobody 66288  0.0  1.6   9260  8304 ??  S     5:00PM   0:05.02 /usr/local/www1/apache/bin/httpd
nobody 66290  0.5  1.7   9672  8724 ??  S     5:00PM   0:05.34 /usr/local/www1/apache/bin/httpd
nobody 66291  0.0  1.6   9312  8432 ??  S     5:00PM   0:06.26 /usr/local/www1/apache/bin/httpd
nobody 66292  0.6  1.6   9364  8408 ??  S     5:00PM   0:05.65 /usr/local/www1/apache/bin/httpd
nobody 67382  0.0  0.1    628   256 ??  I     5:25PM   0:00.00 sh -c /usr/sbin/sendmail -t -i
nobody 67383  0.0  0.4   2756  1916 ??  S     5:25PM   0:00.02 sendmail: ./h59HPeAw067383 localhost.com.: user
root   67384  0.0  0.3   2448  1648 ??  S     5:26PM   0:00.04 sshd: xiyang@ttyp0 (sshd)
xiyang 67385  0.0  0.1    636   264 p0  Is    5:26PM   0:00.01 -sh (sh)
root   67386  0.0  0.2   1280   828 p0  S     5:26PM   0:00.02 _su (csh)
nobody 67389  0.0  0.6   4364  3108 ??  S     5:27PM   0:00.00 /usr/local/www1/apache/bin/httpd
nobody 67390  0.0  0.6   4364  3108 ??  S     5:27PM   0:00.00 /usr/local/www1/apache/bin/httpd
nobody 67391  0.0  0.6   4364  3108 ??  S     5:27PM   0:00.00 /usr/local/www1/apache/bin/httpd
nobody 66305  0.0  0.0      0     0 ??  Z     5:01PM   0:00.00 (sh)
root   67392  0.0  0.0    448   212 p0  R+    5:27PM   0:00.00 ps -aux
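The ps -aux listing marks the zombies with STAT Z, but it has no PPID column, so it does not show which process failed to reap them. One possible way to narrow that down is sketched below in Python. It assumes the local ps accepts `-A -o pid,ppid,stat,command`; the function names `zombies_by_parent` and `current_zombies` are invented for this example.

```python
import subprocess
from collections import defaultdict

def zombies_by_parent(ps_text):
    """Group zombie PIDs by parent PID.

    Expects lines in the form `PID PPID STAT COMMAND`; header and
    malformed lines are skipped.
    """
    groups = defaultdict(list)
    for line in ps_text.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue  # header row or something unexpected
        pid, ppid, stat = int(fields[0]), int(fields[1]), fields[2]
        if stat.startswith("Z"):  # Z = zombie (exited but not yet reaped)
            groups[ppid].append(pid)
    return dict(groups)

def current_zombies():
    """Run ps on this machine and return {parent_pid: [zombie_pid, ...]}."""
    # Assumption: this option set works with the installed ps.
    out = subprocess.run(
        ["ps", "-A", "-o", "pid,ppid,stat,command"],
        capture_output=True, text=True, check=True,
    ).stdout
    return zombies_by_parent(out)
```

If every zombie turns out to share one httpd child as its parent, that points at whatever that child launched through `sh` (a CGI or script handler, for example) rather than at the system as a whole. That is only a hint for where to look next, not a diagnosis.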
Code:
# ipcs
Message Queues:
T     ID     KEY        MODE       OWNER    GROUP

Shared Memory:
T     ID     KEY        MODE       OWNER    GROUP

Semaphores:
T     ID     KEY        MODE       OWNER    GROUP
Code:
# vmstat
 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre  flt  re  pi  po  fr  sr ad0 md0   in   sy  cs us sy id
 0 0 0  199900  36940  428   0   0   0 493 183   0   0  267 3114  58 27 10 63

# iostat
      tty             ad0              md0             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   27  0.00   0  0.00   0.00   0  0.00  27  0  9  0 63
As soon as I restart Apache everything is fine again, but within a minute the zombie processes are sure to come back.
Code:
last pid: 68030;  load averages:  0.23,  0.39,  0.41    up 6+22:56:56  17:41:24
26 processes:  1 running, 25 sleeping
CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 78M Active, 235M Inact, 89M Wired, 27M Cache, 60M Buf, 70M Free
Swap: 512M Total, 132K Used, 512M Free

  PID USERNAME PRI NICE    SIZE     RES STATE     TIME   WCPU    CPU COMMAND
 1177 mysql      2    0    106M  54568K poll    550:27  1.95%  1.95% mysqld
64692 root      10    0   8544K   7940K nanslp   15:26  0.00%  0.00% perl
  100 root       2    0    944K    480K select    0:47  0.00%  0.00% syslogd
47801 root       2    0   4348K   3244K select    0:04  0.00%  0.00% httpd
  109 root       2    0   2320K   1072K select    0:03  0.00%  0.00% sshd
  137 root       2    0   1748K    920K select    0:02  0.00%  0.00% proftpd
  107 root      10    0   1012K    516K nanslp    0:02  0.00%  0.00% cron
68030 root      28    0   1900K   1064K RUN       0:00  0.00%  0.00% top
67384 root       2    0   2448K   1648K select    0:00  0.00%  0.00% sshd
67386 root      18    0   1280K    828K pause     0:00  0.00%  0.00% csh
67385 xiyang    10    0    636K    264K wait      0:00  0.00%  0.00% sh
 1158 root      10    0    648K    264K wait      0:00  0.00%  0.00% sh
  197 root       3    0    948K    460K ttyin     0:00  0.00%  0.00% getty
  191 root       3    0    948K    460K ttyin     0:00  0.00%  0.00% getty
  196 root       3    0    948K    460K ttyin     0:00  0.00%  0.00% getty
  194 root       3    0    948K    460K ttyin     0:00  0.00%  0.00% getty
  195 root       3    0    948K    460K ttyin     0:00  0.00%  0.00% getty
  198 root       3    0    948K    460K ttyin     0:00  0.00%  0.00% getty
  192 root       3    0    948K    460K ttyin     0:00  0.00%  0.00% getty
  193 root       3    0    948K    460K ttyin     0:00  0.00%  0.00% getty
 1157 root      18    0   1276K    848K pause     0:00  0.00%  0.00% csh
68025 nobody     2    0   4348K   3272K accept    0:00  0.00%  0.00% httpd
What is causing this?
One more thing puzzles me. In /etc/rc.conf I have: sendmail_enable="NONE"
On top of that I ran: chmod 0 /usr/sbin/sendmail. So how is a sendmail process still showing up???
[This post was edited by 夕阳 (cimsxiyang) on 06-09 at 18:41]