strange crashes, CPU GPF during disk operations on SW SATA (?)

Petr Stehlik pstehlik at sophics.cz
Fri Mar 24 11:09:42 CET 2006


Hi all,

from time to time some process crashes or hangs on me in a strange way,
mostly during moderately intensive disk operations run at night from cron
(e.g. find / or rsync). The machine has an ext3 filesystem on /dev/md0,
150 GB (37% free), SW RAID 1 over SATA disks, VIA K8T800 chipset. There is
also one forgotten 120 GB PATA disk mounted on /data, 0% free...

AMD Athlon(tm) 64 Processor 3000+
RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID
Controller (rev 80)
IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
1 GB RAM, 1+1 GB swap on separate partitions (outside the RAID), mostly unused.
Decent power supply (550 W, I think) and a decent, well-ventilated case.
Chipset temperature 23 °C, CPU 33 °C, disks 30 °C, and S.M.A.R.T. reports no problems.

When it happens, the kernel complains like this:

Mar 23 05:03:22 www kernel: general protection fault: 0000 [1] 
Mar 23 05:03:23 www kernel: CPU 0 
Mar 23 05:03:23 www kernel: Modules linked in: ipv6 ipt_MASQUERADE ipt_REJECT ipt_LOG ipt_state ipt_pkttype ipt_recent ipt_iprange ipt_physdev ipt_multiport ipt_conntrack iptable_mangle ip_nat_irc ip_nat_tftp ip_nat_ftp iptable_nat ip_nat ip_conntrack_irc ip_conntrack_tftp ip_conntrack_ftp ip_conntrack nfnetlink iptable_filter ip_tables sk98lin psmouse joydev serio_raw pcspkr evdev i2c_viapro i2c_core shpchp pci_hotplug ext3 jbd mbcache raid1 md_mod ide_cd cdrom ide_disk ide_generic sd_mod sata_via via82cxxx generic ide_core libata scsi_mod 3c59x mii skge thermal processor fan
Mar 23 05:03:23 www kernel: Pid: 134, comm: kswapd0 Not tainted 2.6.15-1-amd64-k8 #2
Mar 23 05:03:23 www kernel: RIP: 0010:[prune_dcache+264/388] <ffffffff8017ba7d>{prune_dcache+264}
Mar 23 05:03:23 www kernel: RSP: 0018:ffff81003e403db8  EFLAGS: 00010282
Mar 23 05:03:23 www kernel: RAX: ffff810008e671b8 RBX: ffff810008e67150 RCX: ffff81002dc07da0
Mar 23 05:03:23 www kernel: RDX: fffe81002dc07da0 RSI: ffff810008e67160 RDI: ffffffff803be360
Mar 23 05:03:23 www kernel: RBP: ffff81002dc07d70 R08: 0000000000000064 R09: 0000000000000000
Mar 23 05:03:23 www kernel: R10: 000000000000000c R11: 0000000000000000 R12: 0000000000000043
Mar 23 05:03:23 www kernel: R13: 0000000000000020 R14: 00000000000000d0 R15: 0000000000000000
Mar 23 05:03:23 www kernel: FS:  00002aaaabfa1a00(0000) GS:ffffffff803e4800(0000) knlGS:00000000556b55a0
Mar 23 05:03:23 www kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Mar 23 05:03:23 www kernel: CR2: 00002aaaac39b67f CR3: 000000003cb92000 CR4: 00000000000006e0
Mar 23 05:03:23 www kernel: Process kswapd0 (pid: 134, threadinfo ffff81003e402000, task ffff81003eb7c0c0)
Mar 23 05:03:23 www kernel: Stack: ffff81003ffa8480 0000000000006144 0000000000000082 ffffffff8017be79 
Mar 23 05:03:23 www kernel:        0000000000000064 ffffffff80155ca9 000000000003839e ffffffff8032ebf0 
Mar 23 05:03:23 www kernel:        0000000000000001 ffffffff8032e9c0 
Mar 23 05:03:23 www kernel: Call Trace:<ffffffff8017be79>{shrink_dcache_memory+19} <ffffffff80155ca9>{shrink_slab+233}
Mar 23 05:03:23 www kernel:        <ffffffff80157012>{balance_pgdat+571} <ffffffff80157248>{kswapd+272}
Mar 23 05:03:23 www kernel:        <ffffffff8013ea17>{autoremove_wake_function+0} <ffffffff8013ea17>{autoremove_wake_function+0}
Mar 23 05:03:23 www kernel:        <ffffffff8010efbe>{child_rip+8} <ffffffff80157138>{kswapd+0}
Mar 23 05:03:23 www kernel:        <ffffffff8010efb6>{child_rip+0} 
Mar 23 05:03:23 www kernel: 
Mar 23 05:03:23 www kernel: Code: 48 89 4a 08 48 89 11 48 89 40 08 48 89 43 68 83 7d 50 00 75 
Mar 23 05:03:23 www kernel: RIP <ffffffff8017ba7d>{prune_dcache+264} RSP <ffff81003e403db8>

or, a little later, like this:

Mar 23 06:26:21 www kernel:  <0>general protection fault: 0000 [2] 
Mar 23 06:26:21 www kernel: CPU 0 
Mar 23 06:26:21 www kernel: Modules linked in: ipv6 ipt_MASQUERADE ipt_REJECT ipt_LOG ipt_state ipt_pkttype ipt_recent ipt_iprange ipt_physdev ipt_multiport ipt_conntrack iptable_mangle ip_nat_irc ip_nat_tftp ip_nat_ftp iptable_nat ip_nat ip_conntrack_irc ip_conntrack_tftp ip_conntrack_ftp ip_conntrack nfnetlink iptable_filter ip_tables sk98lin psmouse joydev serio_raw pcspkr evdev i2c_viapro i2c_core shpchp pci_hotplug ext3 jbd mbcache raid1 md_mod ide_cd cdrom ide_disk ide_generic sd_mod sata_via via82cxxx generic ide_core libata scsi_mod 3c59x mii skge thermal processor fan
Mar 23 06:26:21 www kernel: Pid: 3594, comm: find Not tainted 2.6.15-1-amd64-k8 #2
Mar 23 06:26:21 www kernel: RIP: 0010:[__d_find_alias+23/172] <ffffffff8017b840>{__d_find_alias+23}
Mar 23 06:26:21 www kernel: RSP: 0018:ffff81002b989c50  EFLAGS: 00010287
Mar 23 06:26:21 www kernel: RAX: 0000000000008000 RBX: ffff81002dc07d70 RCX: fffe81002dc07da0
Mar 23 06:26:21 www kernel: RDX: fffe81002dc07da0 RSI: 0000000000000001 RDI: ffff81002dc07d70
Mar 23 06:26:21 www kernel: RBP: ffff810004de6660 R08: 0000000000000000 R09: ffff810008e67150
Mar 23 06:26:21 www kernel: R10: ffff81002dc07da0 R11: ffff81002eaa72e8 R12: 0000000000000000
Mar 23 06:26:21 www kernel: R13: ffff81001ecf4150 R14: ffff81002b989d28 R15: ffff81002b989e68
Mar 23 06:26:21 www kernel: FS:  00002aaaaae00640(0000) GS:ffffffff803e4800(0000) knlGS:00000000556b55a0
Mar 23 06:26:21 www kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 23 06:26:21 www kernel: CR2: 00002aaaab19a000 CR3: 0000000021ce3000 CR4: 00000000000006e0
Mar 23 06:26:21 www kernel: Process find (pid: 3594, threadinfo ffff81002b988000, task ffff8100305896b0)
Mar 23 06:26:21 www kernel: Stack: ffffffff8017c324 ffff81002dc07d70 ffff810004de6660 ffff81003e0c0000 
Mar 23 06:26:21 www kernel:        ffffffff880eeb91 ffff810013210a68 ffff81002eaa72e8 ffff810004de6660 
Mar 23 06:26:21 www kernel:        fffffffffffffff4 ffff810013210a68 
Mar 23 06:26:21 www kernel: Call Trace:<ffffffff8017c324>{d_splice_alias+32} <ffffffff880eeb91>{:ext3:ext3_lookup+146}
Mar 23 06:26:21 www kernel:        <ffffffff80172ecc>{real_lookup+112} <ffffffff80173194>{do_lookup+93}
Mar 23 06:26:21 www kernel:        <ffffffff80173a1a>{__link_path_walk+2126} <ffffffff80173ec4>{link_path_walk+76}
Mar 23 06:26:21 www kernel:        <ffffffff8017427e>{path_lookup+363} <ffffffff801744c8>{__user_walk+46}
Mar 23 06:26:21 www kernel:        <ffffffff8016f466>{vfs_lstat+21} <ffffffff8016f7a4>{sys_newlstat+17}
Mar 23 06:26:21 www kernel:        <ffffffff8010e4de>{system_call+126} 
Mar 23 06:26:21 www kernel: 
Mar 23 06:26:21 www kernel: Code: 48 8b 12 0f 18 0a 0f b7 47 4c 4c 8d 49 98 25 00 f0 00 00 3d 
Mar 23 06:26:21 www kernel: RIP <ffffffff8017b840>{__d_find_alias+23} RSP <ffff81002b989c50>

This makes me very uneasy. I have already tried kernels 2.6.8, 2.6.12 and
2.6.14, and I am now running 2.6.15 (all Debian builds for k8). The problem
persists, so I think it is more likely a HW problem.
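
A side note on the register dumps above, offered as a hint rather than a
diagnosis: on x86-64 with 48-bit virtual addresses (an assumption here, but
true of the Athlon 64), a pointer must be "canonical", i.e. bits 63..47 must
all be equal, and touching a non-canonical address raises a general
protection fault instead of a page fault. The minimal Python sketch below
checks the register values copied from the two reports and compares the
suspicious value with its near-twin. The helper names (parse_registers,
is_canonical, non_canonical, flipped_bits) are made up for this illustration.

```python
import re

# Register lines copied verbatim from the two reports above.
OOPS_1 = """
RAX: ffff810008e671b8 RBX: ffff810008e67150 RCX: ffff81002dc07da0
RDX: fffe81002dc07da0 RSI: ffff810008e67160 RDI: ffffffff803be360
"""
OOPS_2 = """
RAX: 0000000000008000 RBX: ffff81002dc07d70 RCX: fffe81002dc07da0
RDX: fffe81002dc07da0 RSI: 0000000000000001 RDI: ffff81002dc07d70
RBP: ffff810004de6660 R08: 0000000000000000 R09: ffff810008e67150
R10: ffff81002dc07da0 R11: ffff81002eaa72e8 R12: 0000000000000000
"""


def parse_registers(text):
    """Return {register name: integer value} for 'Rxx: <16 hex digits>' pairs."""
    pairs = re.findall(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def is_canonical(addr):
    """Assumes 48-bit virtual addresses: bits 63..47 must be all 0 or all 1."""
    return (addr >> 47) in (0, 0x1FFFF)


def non_canonical(regs):
    """Names of registers whose value is not a canonical address."""
    return sorted(name for name, value in regs.items() if not is_canonical(value))


def flipped_bits(a, b):
    """Bit positions in which the two 64-bit values differ."""
    diff = a ^ b
    return [bit for bit in range(64) if (diff >> bit) & 1]


regs1 = parse_registers(OOPS_1)
regs2 = parse_registers(OOPS_2)
print("non-canonical in report 1:", non_canonical(regs1))
print("non-canonical in report 2:", non_canonical(regs2))
print("bits differing between RCX and RDX in report 1:",
      flipped_bits(regs1["RCX"], regs1["RDX"]))
```

With the values above, the check flags RDX in the first report and RCX and
RDX in the second, always the same value fffe81002dc07da0, which differs
from the valid-looking ffff81002dc07da0 (RCX in the first report, R10 in the
second) in bit 48 only. A single flipped bit in a stored pointer is a pattern
often associated with faulty memory, although by itself it proves nothing; a
long memory test would be a cheap next step.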

Does anyone have an idea which way the wind is blowing from? And how to
redirect it somewhere else (say, to /dev/null)?

Thanks.

Petr



