Random computer freezes

Michal Špaček skim at deltaes.cz
Wed Oct 31 16:53:48 CET 2001


On Wed, Oct 31, 2001 at 04:49:22PM +0100, Michal Špaček wrote:
> On Wed, Oct 31, 2001 at 04:24:41PM +0100, Richard Kotal wrote:
> > Under Windows the computer randomly crashes to a blue screen. Under Linux,
> > I'd say that at the same moments, instead of crashing, it prints this
> > message on the console:
> > Uhhuh. NMI received for unknown reason 3d.
> > Dazed and confused, but trying to continue.
> > Do you have a strange power saving mode enabled?
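
A rough sketch of why the kernel calls reason 3d "unknown". It assumes the usual PC layout of the status byte read from I/O port 0x61, where bit 7 flags a memory parity error and bit 6 an I/O check error; the helper name decode_nmi_reason is made up for illustration and this is not the kernel's code.

```python
# Toy decoder for the NMI status byte (assumed to come from I/O port 0x61).
# Assumption: bit 7 = memory parity error, bit 6 = I/O check error.
# decode_nmi_reason is a hypothetical helper, not a kernel function.

MEM_PARITY = 0x80  # bit 7
IO_CHECK = 0x40    # bit 6


def decode_nmi_reason(reason):
    """Return the list of known causes encoded in the status byte."""
    causes = []
    if reason & MEM_PARITY:
        causes.append("memory parity error")
    if reason & IO_CHECK:
        causes.append("I/O check error")
    # Neither bit set: the source of the NMI cannot be identified.
    return causes or ["unknown"]


if __name__ == "__main__":
    print(decode_nmi_reason(0x3D))  # ['unknown']
```

0x3d is 0011 1101 in binary, so neither of the two top bits is set, which matches the "unknown reason" wording in the message above.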

I should also add this:
cat /usr/src/linux/Documentation/nmi_watchdog.txt

Is your ix86 system locking up unpredictably? No keyboard activity, just
a frustrating complete hard lockup? Do you want to help us debugging
such lockups? If all yes then this document is definitely for you.

On Intel and similar ix86 type hardware there is a feature that enables
us to generate 'watchdog NMI interrupts'.  (NMI: Non Maskable Interrupt
which get executed even if the system is otherwise locked up hard).
This can be used to debug hard kernel lockups.  By executing periodic
NMI interrupts, the kernel can monitor whether any CPU has locked up,
and print out debugging messages if so.  You must enable the NMI
watchdog at boot time with the 'nmi_watchdog=n' boot parameter.  Eg.
the relevant lilo.conf entry:

        append="nmi_watchdog=1"
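
After rebooting, it can be worth confirming that the parameter really reached the kernel. The following is only a minimal sketch: nmi_watchdog_mode is a hypothetical helper, and the sample command line is invented. On a running system you would pass it the contents of /proc/cmdline instead.

```python
# Minimal sketch: look for nmi_watchdog=N on a kernel command line.
# nmi_watchdog_mode is a hypothetical helper; the sample string is invented.

def nmi_watchdog_mode(cmdline):
    """Return the value given to nmi_watchdog= as a string, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "nmi_watchdog" and sep:
            return value
    return None


if __name__ == "__main__":
    sample = "auto BOOT_IMAGE=linux ro root=/dev/hda1 nmi_watchdog=1"
    print(nmi_watchdog_mode(sample))
```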

For SMP machines and UP machines with an IO-APIC use nmi_watchdog=1.
For UP machines without an IO-APIC use nmi_watchdog=2, this only works
for some processor types.  If in doubt, boot with nmi_watchdog=1 and
check the NMI count in /proc/interrupts; if the count is zero then
reboot with nmi_watchdog=2 and check the NMI count.  If it is still
zero then log a problem, you probably have a processor that needs to be
added to the nmi code.
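
The NMI count check described above could be scripted roughly like this. It is a sketch under assumptions: the NMI line is taken to be "NMI:" followed by one counter per CPU, nmi_count is a made-up helper, and SAMPLE is invented data rather than output from a real machine.

```python
# Sketch: sum the per-CPU NMI counters from /proc/interrupts-style text.
# Assumption: the line looks like "NMI:" followed by one integer per CPU.
# nmi_count is a hypothetical helper and SAMPLE is invented data.

def nmi_count(interrupts_text):
    """Return the total NMI count, or None if there is no NMI line."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            total = 0
            for field in fields[1:]:
                if not field.isdigit():
                    break  # stop at any trailing description text
                total += int(field)
            return total
    return None


SAMPLE = """\
           CPU0       CPU1
  0:     123456     123400    IO-APIC-edge  timer
NMI:       4321       4300
LOC:     123450     123449
ERR:          0
"""

if __name__ == "__main__":
    count = nmi_count(SAMPLE)
    if count:
        print("NMI watchdog is ticking, count =", count)
    else:
        print("NMI count is zero or missing; try the other nmi_watchdog mode")
```

To use it on a live system, read /proc/interrupts into a string and pass that in place of SAMPLE.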

A 'lockup' is the following scenario: if any CPU in the system does not
execute the periodic local timer interrupt for more than 5 seconds, then
the NMI handler generates an oops and kills the process. This
'controlled crash' (and the resulting kernel messages) can be used to
debug the lockup. Thus whenever the lockup happens, wait 5 seconds and
the oops will show up automatically. If the kernel produces no messages
then the system has crashed so hard (eg. hardware-wise) that either it
cannot even accept NMI interrupts, or the crash has made the kernel
unable to print messages.
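
The lockup rule above can be pictured with a toy model. This is not kernel code: the class, its fields and the nmi_hz parameter (watchdog NMIs per second, assumed to be 1 here) are all invented for illustration. Each simulated NMI checks whether the CPU's local timer counter has moved since the previous NMI and reports a lockup after about 5 seconds without progress.

```python
# Toy model of the lockup rule; all names are hypothetical, not kernel code.
# Assumption: the watchdog NMI fires nmi_hz times per second on each CPU.

TIMEOUT_SECONDS = 5


class WatchdogModel:
    def __init__(self, num_cpus, nmi_hz=1):
        self.nmi_hz = nmi_hz
        self.timer_ticks = [0] * num_cpus   # bumped by the local timer interrupt
        self.last_seen = [0] * num_cpus     # tick count seen at the previous NMI
        self.stalled_nmis = [0] * num_cpus  # consecutive NMIs without progress

    def timer_interrupt(self, cpu):
        """The CPU is alive and ran its periodic local timer interrupt."""
        self.timer_ticks[cpu] += 1

    def nmi(self, cpu):
        """Return True if this NMI finds the CPU locked up."""
        if self.timer_ticks[cpu] != self.last_seen[cpu]:
            # Progress since the last NMI: remember it and reset the counter.
            self.last_seen[cpu] = self.timer_ticks[cpu]
            self.stalled_nmis[cpu] = 0
            return False
        self.stalled_nmis[cpu] += 1
        return self.stalled_nmis[cpu] >= TIMEOUT_SECONDS * self.nmi_hz


if __name__ == "__main__":
    wd = WatchdogModel(num_cpus=2)
    for second in range(1, 7):
        wd.timer_interrupt(0)  # CPU 0 keeps running; CPU 1 is stuck
        print(second, wd.nmi(0), wd.nmi(1))
```

In the real kernel the point where the model returns True is where the oops would be generated; the model only shows the counting.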

NOTE: starting with 2.4.2-ac18 the NMI-oopser is disabled by default,
you have to enable it with a boot time parameter.  Prior to 2.4.2-ac18
the NMI-oopser is enabled unconditionally on x86 SMP boxes.

[ feel free to send bug reports, suggestions and patches to
  Ingo Molnar <mingo at redhat.com> or the Linux SMP mailing
  list at <linux-smp at vger.kernel.org> ]

skim
-- 
---------------------------------------------------
  Michal "sKim" Špaček         	Brno, CZ, Europe
 E-mail: skim at deltaes.com
    icq: 66962942		user: debian, TeX		
------=[ #!/usr/bin/perl ]=------------------------

