BUG: sort (solution) [Was: does sort have problems with a large file?]

David Kuzela david at kuzela.cz
Monday October 8 22:45:19 CEST 2001


Oto Buchta:
> complexity. Judging by the number of files, it looks like it uses a partial
> mergesort on top of quicksort. I don't know, I'm guessing.

Exactly :)


The problem is apparently that it gets stuck in a loop somewhere; ltrace
repeats this ad nauseam (until I ran out of disk space):

strcoll(0x08052360, 0x080585a0, 0xbfffd38c, 0x0804d1ee, 3) = 1
strcoll(0x080585a0, 0x08052360, 0xbfffd38c, 0x0804d1ee, 19) = 62
strcoll(0x08052360, 0x080585a0, 0xbfffd38c, 0x0804d1ee, 3) = 1
strcoll(0x080585a0, 0x08052360, 0xbfffd38c, 0x0804d1ee, 19) = 62
strcoll(0x08052360, 0x080585a0, 0xbfffd38c, 0x0804d1ee, 3) = 1
strcoll(0x080585a0, 0x08052360, 0xbfffd38c, 0x0804d1ee, 19) = 62
strcoll(0x08052360, 0x080585a0, 0xbfffd38c, 0x0804d1ee, 3) = 1
strcoll(0x080585a0, 0x08052360, 0xbfffd38c, 0x0804d1ee, 19) = 62
...

-=> It keeps comparing the same two strings over and over, so it is most
likely a bug. What it should be doing is something like this:

strcoll(0x4026a392, 0x4026a424, 0x4026a45e, 30, 23) = 17
strcoll(0x4026a392, 0x4026a411, 0x4026a45e, 30, 19) = 17
strcoll(0x4026a392, 0x4026a45e, 0x4026a45e, 30, 29) = 4
strcoll(0x4026a392, 0x4026a43b, 0x4026a45e, 30, 35) = -4
strcoll(0x4026a3f7, 0x4026a43b, 0x4026a45e, 30, 35) = -1
memcpy(0xbfffd2b4, "\250S\246\263\272q\254P / \250S\246\263\274\320\303D\n", 20) = 0xbfffd2b4
strcoll(0x4026a4bb, 0xbfffd2b4, 20, 0, 0xb3a653a8) = -16
memcpy(0xbfffd2b4, "Assorted / Compilation #1\n", 26) = 0xbfffd2b4
strcoll(0x4026a4f0, 0xbfffd2b4, 26, 0, 0x6f737341) = 8
strcoll(0x4026a4bb, 0x4026a513, 0xbfffd2b4, 27, 26) = 5
strcoll(0x4026a4bb, 0x4026a4f0, 0xbfffd2b4, 27, 35) = -3
memcpy(0xbfffd2e4, "Helloween / Walls of Jericho/Jud"..., 35) = 0xbfffd2e4
strcoll(0x4026a4dc, 0xbfffd2e4, 35, 0x40020054, 0x6c6c6548) = 13
memcpy(0xbfffd2b4, "Annabel Lamb / Justice\n", 23) = 0xbfffd2b4
strcoll(0x4026a52d, 0xbfffd2b4, 23, 0x40020054, 0x616e6e41) = 1
memcpy(0xbfffd264, "Alphaville / First Harvest 1984-"..., 35) = 0xbfffd264
strcoll(0x4026a571, 0xbfffd264, 35, 1, 0x68706c41) = 25

The bug is reproducible (on my machine) and still occurs when I build
textutils from source.

When I compile sort with a 16 MB buffer (just to be safe :), it sorts
the file in 0m5.584s! That matches my expectations.

I'll try to find out what is causing it and send a report.

-- 
                                                David Kuzela 
-=[david at kuzela.cz]=-=[ICQ][24470559]=-=[http://penguin.cz/~dawyd]=-

