[ale] Something I thought I'd never see
- Subject: [ale] Something I thought I'd never see
- From: DCox at icc.net (Danny Cox)
- Date: Wed, 11 Oct 2006 09:56:42 -0400
- In-reply-to: <[email protected]>
- References: <[email protected]>
Jeff,
On Wed, 2006-10-11 at 09:31 -0400, Jeff Lightner wrote:
<snip>
> However I'm wondering how I might have figured this out if I hadn't
> been able to narrow down the day except by running ps -ef and looking
> for oddities such as the ones I found? This prompted the question
> above. I often see what appear to me to be abnormally high load
> averages (as compared to what I'd think reasonable on the UNIX boxes
> I've worked on) but they don't seem to actually impact performance
> overall.
When something is wrong, a "ps ef" will continually show processes stuck
in 'D' state. Normally, you'll only be able to catch one or two in that
state, and the next time you run ps, they'll be 'R'unning or 'S'leeping.
'D' is described as a "short sleep" (ps(1) calls it uninterruptible
sleep). A process is in that state while the kernel is running on its
behalf doing disk I/O. That's usually much less than a second.
So, if you're continually seeing processes stuck in 'D' state, that's
probably filesystem corruption, or a disk slowly dying.
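To watch for this without eyeballing ps output, here is a minimal Python sketch that lists processes currently in 'D' state by reading /proc/<pid>/stat. The function names (parse_state, find_d_state) and the proc_root parameter are made up for illustration, and it assumes a Linux-style /proc.

```python
import os

def parse_state(stat_text):
    """Return the one-letter state field from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split after the LAST ')'.
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]

def find_d_state(proc_root="/proc"):
    """Return a list of (pid, command) for processes currently in 'D' state."""
    stuck = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return stuck  # no /proc available on this system
    for entry in entries:
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat"),
                      errors="replace") as f:
                text = f.read()
            if parse_state(text) == "D":
                comm = text[text.index("(") + 1:text.rindex(")")]
                stuck.append((int(entry), comm))
        except (OSError, IndexError, ValueError):
            continue  # process exited or the stat line was unreadable
    return stuck

if __name__ == "__main__":
    for pid, comm in find_d_state():
        print(pid, comm)
```

Run it a few times, several seconds apart. A pid that shows up every time is a candidate for the "stuck" case; one that appears once and vanishes is just normal disk I/O.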
You can do an ls -l on /proc/<pid>/fd to see what files it has open.
One of those will be the problem child. You can then determine the
filesystem in question.
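The same lookup can be scripted. The following is only a sketch, assuming Linux's /proc/<pid>/fd symlinks and /proc/mounts; the helper names (open_files, read_mount_points, mount_point_for) are hypothetical, and you'll generally need to be root or the process owner to read another process's fd directory.

```python
import os

def open_files(pid, proc_root="/proc"):
    """Map each open fd number of `pid` to the target of its /proc symlink."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except (OSError, ValueError):
            continue  # fd was closed in the meantime, or not a number
    return result

def read_mount_points(mounts_file="/proc/mounts"):
    """Return the mount points (second field) listed in a mounts file."""
    # Note: /proc/mounts escapes spaces in paths as \040; this sketch
    # does not decode those escapes.
    points = []
    with open(mounts_file) as f:
        for line in f:
            fields = line.split()
            if len(fields) >= 2:
                points.append(fields[1])
    return points

def mount_point_for(path, mount_points):
    """Return the longest mount point that is a path prefix of `path`."""
    best = None
    for mp in mount_points:
        prefix = mp.rstrip("/") + "/"
        if path == mp or path.startswith(prefix):
            if best is None or len(mp) > len(best):
                best = mp
    return best  # None for non-path targets such as "socket:[1234]"

if __name__ == "__main__":
    mounts = read_mount_points()
    for fd, target in sorted(open_files(os.getpid()).items()):
        print(fd, target, mount_point_for(target, mounts))
```

The demo at the bottom inspects the script's own pid so it is harmless to run; substitute the pid of the stuck process to find which filesystem its files live on.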
You might also try using strace -p <pid> to trace the process. It may
show the system call the process is currently blocked in. If it does,
the first argument in a read or write is the fd. Then look at
/proc/<pid>/fd/<fd> to determine the filesystem in question.
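If you save the strace output to a file, pulling the fd out of it is easy to script. A rough sketch follows; the function name fd_from_strace_line is hypothetical, the sample lines in the comments are illustrative rather than taken from a real trace, and it only looks for plain read and write calls.

```python
import re

# Matches "read(<fd>" or "write(<fd>" anywhere in a line of strace output.
# The call may be unfinished (no closing paren) if the process is blocked.
_RW_CALL = re.compile(r"\b(read|write)\((\d+)")

def fd_from_strace_line(line):
    """Return (syscall, fd) for a read/write line, or None otherwise."""
    match = _RW_CALL.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

if __name__ == "__main__":
    # Illustrative lines only: a blocked read, and a completed write.
    for sample in ('read(3, ', 'write(7, "abc", 3) = 3'):
        print(fd_from_strace_line(sample))
```

With the fd in hand, /proc/<pid>/fd/<fd> is the symlink to check.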
Good luck!
--
Daniel S. Cox
Internet Commerce Corporation