
Current Segmentation Fault Woes



Aelorean
07-03-2002, 03:14 PM
I am having a heck of a time with utterly random segmentation faults in ShowEQ. I read on one of the boards at some point that the reason for the crash is known (memory overwriting, or something to that effect).

I was just wondering if this was something that was being investigated or if the crashes are not as frequent for other users. I know that my ShowEQ crashes at least once every couple hours.

S_B_R
07-03-2002, 03:37 PM
Actually, this happens to me as well. Maybe not every 2 hours; it seems to be VERY random for me. But I don't think it's anything directly in SEQ. I think it's a bad packet or something from Verant. The reason I say that is there are 3 people I play with on a regular basis, all 3 of them also use SEQ, and 99% of the time, if we are all grouped together and in the same zone, everyone's SEQ session will crash at almost exactly the same time...

high_jeeves
07-03-2002, 03:49 PM
Your best bet is to post a backtrace of the crash next time it happens so that the devs can take a look at it.

Somewhere on these boards are instructions for how to do this. Search for "backtrace" and they should pop up. Just post the output here so the devs have somewhere to start from.

--Jeeves

Aelorean
07-03-2002, 05:03 PM
The first one given below happens quite frequently. Obviously, I am not listing duplicate backtraces from different core files here.

Here are a few examples:



#0 0x40162b93 in QGDictIterator::operator++() () at eval.c:41
#1 0x080f64ce in Map::paintDrops(MapParameters&, QPainter&) (this=0x81fcfb0, param=@0x81fd028, p=@0xbfffe8d0)
at /usr/local/qt/include/qintdict.h:97
#2 0x080f5ac3 in Map::paintMap(QPainter*) (this=0x81fcfb0, p=0xbfffea30) at map.cpp:2928
#3 0x080f8900 in Map::paintEvent(QPaintEvent*) (this=0x81fcfb0, e=0xbfffec70) at map.cpp:4039
#4 0x40270234 in QWidget::event(QEvent*) () at eval.c:41
#5 0x401d651c in QApplication::notify(QObject*, QEvent*) () at eval.c:41
#6 0x401cc621 in QWidget::repaint(int, int, int, int, bool) () at eval.c:41
#7 0x080f5492 in Map::refreshMap() (this=0x81fcfb0) at /usr/local/qt/include/qrect.h:195
#8 0x4021ff36 in QObject::activate_signal(char const*) () at eval.c:41
#9 0x4027dc6f in QTimer::timeout() () at eval.c:41
#10 0x4025cdfb in QTimer::event(QEvent*) () at eval.c:41
#11 0x401d651c in QApplication::notify(QObject*, QEvent*) () at eval.c:41
#12 0x4019ecf0 in qt_activate_timers() () at eval.c:41
#13 0x4019c7a8 in QApplication::processNextEvent(bool) () at eval.c:41
#14 0x401d884b in QApplication::enter_loop() () at eval.c:41
#15 0x4019c2e0 in QApplication::exec() () at eval.c:41
#16 0x08062635 in main (argc=1, argv=0xbffffa44) at main.cpp:927
#17 0x4070a507 in __libc_start_main (main=0x805f498 <main>, argc=1, ubp_av=0xbffffa44, init=0x805b02c <_init>,
fini=0x8171fd0 <_fini>, rtld_fini=0x4000dc14 <_dl_fini>, stack_end=0xbffffa3c) at ../sysdeps/generic/libc-start.c:129



#0 0x4076eda2 in chunk_free (ar_ptr=0x40822620, p=0x84c0c10) at malloc.c:3252
#1 0x4076ebf4 in __libc_free (mem=0x84c0f78) at malloc.c:3154
#2 0x406a52c2 in operator delete(void*) (ptr=0x84c0f78) at ../../../../libstdc++-v3/libsupc++/del_op.cc:39
#3 0x0816ca7f in ~QValueListPrivate (this=0x8277460) at /usr/local/qt/include/qstring.h:652
#4 0x080b8be2 in ~EQInterface (this=0xbfffefa0) at /usr/local/qt/include/qvaluelist.h:198
#5 0x08062649 in main (argc=1, argv=0xbffffa34) at main.cpp:927
#6 0x4070a507 in __libc_start_main (main=0x805f498 <main>, argc=1, ubp_av=0xbffffa34, init=0x805b02c <_init>,
fini=0x8171fd0 <_fini>, rtld_fini=0x4000dc14 <_dl_fini>, stack_end=0xbffffa2c) at ../sysdeps/generic/libc-start.c:129



#0 0x00000004 in __strtol_internal (nptr=0x8280000 "hÿ'\b¨z\035\b", endptr=0xbfffe850, base=-1073747864, group=1075193746)
at eval.c:36
#1 0x401627c7 in QGDictIterator::QGDictIterator(QGDict const&) () at eval.c:41
#2 0x080f6429 in Map::paintDrops(MapParameters&, QPainter&) (this=0x81fcfb0, param=@0x81fd028, p=@0xbfffe8d0)
at /usr/local/qt/include/qintdict.h:88
#3 0x080f5ac3 in Map::paintMap(QPainter*) (this=0x81fcfb0, p=0xbfffea30) at map.cpp:2928
#4 0x080f8900 in Map::paintEvent(QPaintEvent*) (this=0x81fcfb0, e=0xbfffec70) at map.cpp:4039
#5 0x40270234 in QWidget::event(QEvent*) () at eval.c:41
#6 0x401d651c in QApplication::notify(QObject*, QEvent*) () at eval.c:41
#7 0x401cc621 in QWidget::repaint(int, int, int, int, bool) () at eval.c:41
#8 0x080f5492 in Map::refreshMap() (this=0x81fcfb0) at /usr/local/qt/include/qrect.h:195
#9 0x4021ff36 in QObject::activate_signal(char const*) () at eval.c:41
#10 0x4027dc6f in QTimer::timeout() () at eval.c:41
#11 0x4025cdfb in QTimer::event(QEvent*) () at eval.c:41
#12 0x401d651c in QApplication::notify(QObject*, QEvent*) () at eval.c:41
#13 0x4019ecf0 in qt_activate_timers() () at eval.c:41
#14 0x4019c7a8 in QApplication::processNextEvent(bool) () at eval.c:41
#15 0x401d884b in QApplication::enter_loop() () at eval.c:41
#16 0x4019c2e0 in QApplication::exec() () at eval.c:41
#17 0x08062635 in main (argc=1, argv=0xbffffa44) at main.cpp:927
#18 0x4070a507 in __libc_start_main (main=0x805f498 <main>, argc=1, ubp_av=0xbffffa44, init=0x805b02c <_init>,
fini=0x8171fd0 <_fini>, rtld_fini=0x4000dc14 <_dl_fini>, stack_end=0xbffffa3c) at ../sysdeps/generic/libc-start.c:129



#0 0x40161328 in QGDict::look_int(long, void*, int) () at eval.c:41
#1 0x0806da8c in SpawnShell::updateSpawn(unsigned short, short, short, short, short, short, short, signed char, signed char, unsigned char) (this=0x82a96a8, id=699, x=-485, y=433, z=-34, xVel=7, yVel=5, zVel=0, heading=37 '%',
deltaHeading=0 '\000', animation=5 '\005') at /usr/local/qt/include/qintdict.h:64
#2 0x0806dd42 in SpawnShell::updateSpawns(mobUpdateStruct const*) (this=0x82a96a8, updates=0xbfffe77f)
at spawnshell.cpp:623
#3 0x0808fb6e in EQPacket::updateSpawns(mobUpdateStruct const*, unsigned, unsigned char) (this=0x82d8b30, t0=0xbfffe77f,
t1=171, t2=2 '\002') at m_packet.cpp:2102
#4 0x0808a5fc in EQPacket::dispatchZoneData(unsigned, unsigned char*, unsigned char) (this=0x82d8b30, len=171,
data=0xbfffe77f "\237 \013", dir=2 '\002') at packet.cpp:2147
#5 0x080880bc in EQPacket::decodePacket(int, unsigned char*) (this=0x82d8b30, size=208, buffer=0xbfffe75e "E")
at packet.h:401
#6 0x0808760c in EQPacket::processPackets() (this=0x82d8b30) at packet.cpp:782
#7 0x4021ff36 in QObject::activate_signal(char const*) () at eval.c:41
#8 0x4027dc6f in QTimer::timeout() () at eval.c:41
#9 0x4025cdfb in QTimer::event(QEvent*) () at eval.c:41
#10 0x401d651c in QApplication::notify(QObject*, QEvent*) () at eval.c:41
#11 0x4019ecf0 in qt_activate_timers() () at eval.c:41
#12 0x4019c7a8 in QApplication::processNextEvent(bool) () at eval.c:41
#13 0x401d884b in QApplication::enter_loop() () at eval.c:41
#14 0x4019c2e0 in QApplication::exec() () at eval.c:41
#15 0x08062635 in main (argc=1, argv=0xbffffa44) at main.cpp:927
#16 0x4070a507 in __libc_start_main (main=0x805f498 <main>, argc=1, ubp_av=0xbffffa44, init=0x805b02c <_init>,
fini=0x8171fd0 <_fini>, rtld_fini=0x4000dc14 <_dl_fini>, stack_end=0xbffffa3c) at ../sysdeps/generic/libc-start.c:129



#0 0x40162ab0 in QGDictIterator::toFirst() () at eval.c:41
#1 0x401627aa in QGDictIterator::QGDictIterator(QGDict const&) () at eval.c:41
#2 0x080f6890 in Map::paintSpawns(MapParameters&, QPainter&, QTime const&) (this=0x81fcfb0, param=@0x81fd028,
p=@0xbfffe8d0, drawTime=@0xbfffe8a0) at /usr/local/qt/include/qintdict.h:88
#3 0x080f5b41 in Map::paintMap(QPainter*) (this=0x81fcfb0, p=0xbfffea30) at map.cpp:2940
#4 0x080f8900 in Map::paintEvent(QPaintEvent*) (this=0x81fcfb0, e=0xbfffec70) at map.cpp:4039
#5 0x40270234 in QWidget::event(QEvent*) () at eval.c:41
#6 0x401d651c in QApplication::notify(QObject*, QEvent*) () at eval.c:41
#7 0x401cc621 in QWidget::repaint(int, int, int, int, bool) () at eval.c:41
#8 0x080f5492 in Map::refreshMap() (this=0x81fcfb0) at /usr/local/qt/include/qrect.h:195
#9 0x4021ff36 in QObject::activate_signal(char const*) () at eval.c:41
#10 0x4027dc6f in QTimer::timeout() () at eval.c:41
#11 0x4025cdfb in QTimer::event(QEvent*) () at eval.c:41
#12 0x401d651c in QApplication::notify(QObject*, QEvent*) () at eval.c:41
#13 0x4019ecf0 in qt_activate_timers() () at eval.c:41
#14 0x4019c7a8 in QApplication::processNextEvent(bool) () at eval.c:41
#15 0x401d884b in QApplication::enter_loop() () at eval.c:41
#16 0x4019c2e0 in QApplication::exec() () at eval.c:41
#17 0x08062635 in main (argc=1, argv=0xbffffa44) at main.cpp:927
#18 0x4070a507 in __libc_start_main (main=0x805f498 <main>, argc=1, ubp_av=0xbffffa44, init=0x805b02c <_init>,
fini=0x8171fd0 <_fini>, rtld_fini=0x4000dc14 <_dl_fini>, stack_end=0xbffffa3c) at ../sysdeps/generic/libc-start.c:129

Aelorean
07-03-2002, 05:22 PM
Although I'm sure it's always good to see more dump files, I did notice a previous thread on this with almost the same backtraces.

Sorry for any trouble caused by posting what's already been posted.

http://seq.sourceforge.net/showthread.php?s=&threadid=1349

g0hst
07-03-2002, 11:46 PM
There is also this one, http://seq.sourceforge.net/showthread.php?s=&threadid=1537, which is the only one I really get. I sure hope Zaphod can figure out how to fix it, 'cause I sure as hell can't :D Still working on it though; my C is really rusty. Stupid Java.

edit: I suppose we could just submit all these as bugs so we don't have 15 million segfault threads :rolleyes:

fee
07-04-2002, 01:19 AM
From a Dev's perspective - This bug is just WACK! The process of debugging this particular problem is difficult. The cause is based on a particular sequence of events. To begin fixing the problem we must first determine the event sequence that causes this. The events I am referring to are the incoming packets.

So let's say you have been in a zone for an hour or more and BOOM, the segfault. Now we would require a packet capture from the time you zoned in to the time you segfault. Next we would have to follow each spawn, item, player, etc. and all the events related to them.

If you are starting to get an understanding of the task at hand then your head must hurt as bad as my own.

Any help would be greatly appreciated.

fee

Aelorean
07-04-2002, 03:13 AM
Please feel free to correct me if I'm wrong (I haven't looked at this code at all really, so it's very possible I'm way off base). But if we're certain that it is a 'bad' packet that is causing the error, wouldn't it make sense to go through the initial, low-level packet handling routine and code it so that if it encounters something that it does not understand, it handles the error differently?

Here's another question: If ShowEQ received the EXACT same packet twice, but with one bit altered (i.e., a bug), would that cause a problem?

....grasping at straws, hoping to sound not entirely like a fool :)
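
Whether or not a malformed packet turns out to be the cause, the "handle what you don't understand differently" idea can be sketched as a length check before a payload is interpreted. This is only an illustration: `SpawnUpdate` and `decodeSpawnUpdate` are hypothetical names, and the struct is not ShowEQ's real `mobUpdateStruct` layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-size payload; NOT ShowEQ's real mobUpdateStruct layout.
struct SpawnUpdate {
    unsigned short id;
    short x;
    short y;
    short z;
};

// Decodes one update into *out. Returns false, leaving *out untouched, when
// the payload is missing or its length does not match the layout exactly.
bool decodeSpawnUpdate(const unsigned char* data, std::size_t len, SpawnUpdate* out)
{
    assert(out != nullptr); // a null output is a programming error, not a network error
    if (data == nullptr || len != sizeof(SpawnUpdate)) {
        return false; // the caller can log and skip this payload
    }
    // Copy into a properly aligned object rather than casting the raw buffer.
    std::memcpy(out, data, sizeof(SpawnUpdate));
    return true;
}
```

A caller would check the return value and skip (or log) anything that fails, instead of passing a short buffer on to code that assumes a full structure.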

fee
07-04-2002, 03:25 AM
The problem is not caused by a bad packet. The packet engine computes the checksum just as the EQ client does. The packet engine also performs sequencing and duplicate rejection. It's not a packet-level issue, it's a content issue.

fee

g0hst
07-04-2002, 09:51 AM
In my case it seems like only two things could really be happening. One, the value at spawn.h line 112 is getting set with an invalid value (where?), in which case there just needs to be some sanity checking on what gets put into these objects. Or two, there are some synchronization issues and the object is getting deleted by another thread or something out from under paintSpawns() when it tries to read it.

The thread idea may be completely off base though; I don't know enough about how the program as a whole works yet.
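
The second possibility, an entry disappearing while something is still walking the collection, is a hazard even in single-threaded code. A minimal sketch of the safe pattern follows, using `std::map` as a stand-in for the Qt dictionary seen in the backtraces. `Spawn`, `SpawnMap`, and `purgeRemoved` are hypothetical names, not ShowEQ code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a spawn record; not ShowEQ's Spawn class.
struct Spawn {
    std::string name;
    bool removed; // set when the zone reports the spawn gone
};

typedef std::map<int, Spawn> SpawnMap;

// Walks the map once and erases every entry flagged as removed.
// map::erase(iterator) returns the next valid iterator (C++11), so the loop
// never advances an iterator that points at a destroyed element.
std::size_t purgeRemoved(SpawnMap& spawns)
{
    std::size_t purged = 0;
    for (SpawnMap::iterator it = spawns.begin(); it != spawns.end(); ) {
        if (it->second.removed) {
            it = spawns.erase(it);
            ++purged;
        } else {
            ++it;
        }
    }
    return purged;
}
```

Flagging entries and purging them at one well-defined point, rather than deleting from inside a paint or update loop, is one way to keep iterators valid. Whether this matches the actual bug is not established in this thread.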

sequser
07-08-2002, 03:16 AM
I see this most often when there is a group doing AoE in a zone.

One theory I had was that it is caused when a corpse with no loot on it is looted right when the mob dies. This is mostly based on it happening a few times to me when I was the one that looted the corpse (i.e., mob dies, I loot immediately, nothing on it and I get off the corpse right away, boom, seg fault).

Similarly, it seems that the spawn that was deleted before it was processed is a mob corpse spawn.

Take a wizard to a zone with a lot of greens (which have the possibility of loading no loot) that you can nuke all at once, and I bet you can reproduce this bug pretty frequently.

Or alternatively, just add a lot more sanity checking code around the corpse spawns and see if some of these asserts trip when we are all using it.

I certainly get a seg fault at least once a day, depending on the zone.
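
As an illustration of the kind of sanity check meant here, the sketch below looks a spawn up by ID on every update and refuses to touch anything that is no longer in the table. `SpawnTable` and `SpawnPos` are hypothetical names, not taken from the ShowEQ source.

```cpp
#include <cassert>
#include <cstdio>
#include <map>

// Hypothetical position record; not ShowEQ's real spawn structure.
struct SpawnPos {
    int x;
    int y;
    int z;
};

class SpawnTable {
public:
    void add(int id, const SpawnPos& pos) { spawns_[id] = pos; }
    void remove(int id) { spawns_.erase(id); }

    // Applies a position update. If the id is unknown (for example, the
    // spawn was already removed), logs the fact and leaves the table alone.
    bool update(int id, const SpawnPos& pos)
    {
        std::map<int, SpawnPos>::iterator it = spawns_.find(id);
        if (it == spawns_.end()) {
            std::fprintf(stderr, "spawn %d is not in the table; update skipped\n", id);
            return false;
        }
        it->second = pos;
        return true;
    }

    // Returns a pointer to the stored position, or nullptr if the id is unknown.
    const SpawnPos* find(int id) const
    {
        std::map<int, SpawnPos>::const_iterator it = spawns_.find(id);
        return it == spawns_.end() ? nullptr : &it->second;
    }

private:
    std::map<int, SpawnPos> spawns_;
};
```

In a debug build the log line could be replaced with an `assert`, so the program stops at the first inconsistent update instead of crashing somewhere unrelated later.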

PriestOfDiscord
07-10-2002, 11:47 PM
Along with everybody else, every couple of hours I get a seg fault in OT. I noticed that on my last seg fault a group member went LD at the exact same instant. I'm not sure if EQ crashed for him or if he just lost his connection. It could have been a coincidence, but I wanted to add a data point. No core was left after the crash, so I was not able to do a backtrace.

troll
07-11-2002, 08:51 PM
For what it's worth, I used to get at least one segmentation fault a night on my Redhat 7.2 system with qt 2.3.2 and gcc3.01. Recently, I rebuilt that box into a Gentoo system with qt 2.3.2 and gcc 3.1 and haven't had a segmentation fault since *knocks on wood*. Could just be a coincidence, of course, but I thought I would toss it out there that maybe it's an issue with the OS or gcc. I'm not saying anyone should go out there and start changing their system, but if someone wants to build a second box, try out another configuration, and compare it against a standard one, then here you go. OK, done rambling and crawling back into my hole.

docster
07-13-2002, 08:56 AM
I have the same troubles, seems to be real random but I have started to think it is linked to a no loot mob. Mine may run for 8 hours then again it may crash three times in 30 minutes. It always exits with a variant of this:

icebone_skeleton03(2062) has already been removed from the zone before we processed it.

Segmentation fault

Nobody
07-13-2002, 06:12 PM
It seems every time mine faults I get a "<something|Unknown> has been removed before we could process it" or whatever that message is. It sounds like something is despawned while it's in a loop being processed, then the data/names are compared and something freaks out.

Cleric
07-15-2002, 03:22 PM
I have had this happening often also on Mandrake 8.2
(gcc-3.0.4 Qt-3.0.4)

Last night it definitely happened on a group member going linkdead. I know when I used SEQ before that it used to allow you to at least see the monster locations if you had to restart SEQ after a crash. Now when I bring it back up it just sits there and does not show anything until I either camp and come back in, or zone. I used to like it before, because even if I didn't know what the mobile was, I could still use the map, compass and locations to avoid stuff. Now I have to relog somehow before it is useful again. Is this just me, or is it by design now?

Thanks.

fgay trader
07-15-2002, 03:45 PM
Originally posted by Cleric
now when I bring it back up it just sits there and does not show anything

Try the "--restore-all" command-line parameter when starting up SEQ after a crash. That should bring back most of the useful stuff like Map, Decode Key, last known mob/player positions, etc. Also, play with the options (can't remember the menu under which you find them) to set exactly what is saved for the "--restore..." options to.. em.. restore.

tristanbfg
07-16-2002, 04:05 AM
I have this exact same issue with SEQ. I normally see a message "(a gnoll pup 01) was already removed from zone before we processed it." or something like that. It's too late here to remember exactly.

dbrot
07-17-2002, 08:32 PM
I get that same error when SEQ seg faults.
(spawn such & such) has already been removed from the zone before we processed it.

PD_Dingo
07-18-2002, 01:57 AM
Originally posted by fee
The problem is not caused by a bad packet. The packet engine computes the checksum just as the EQ client does. The packet engine also performs sequencing and duplicate rejection. It's not a packet-level issue, it's a content issue.

fee

Just a stupid idea: back in my MUD coding days I used a signal handler to do various things when the MUD server segfaulted. Besides doing a 'hot boot', which could be implemented in SEQ quite easily (more about that below), it would dump info to a log file, like the file and line in the code where it crashed. This could be used to dump the content of the last X packets received, so it could be uploaded and someone who knows what they're doing could look over them and see if there is anything abnormal in them.

Now about the hot boot: the signal handler would catch the segfault, dump the debug info, then dump all the open descriptor numbers and their states, then fork, execl the MUD server with an argument that makes it read the saved descriptor data, and finally die, dumping core. I'm not sure how or if this will work with an X application since I know nothing about X programming, but it may work :D hehe

Dingo
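
The "dump the last X packets on a segfault" part of this idea could look roughly like the sketch below. It keeps recent packets in a statically allocated ring buffer and writes them out from a SIGSEGV handler using only `write(2)`. Everything here (`recordPacket`, `dumpRecentPackets`, `installCrashHandler`, the buffer sizes, the length-prefixed dump format) is an assumption made for illustration, and the hot boot (fork/exec) part is not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <signal.h>
#include <unistd.h>

const std::size_t kSlots = 8;    // how many recent packets to keep (arbitrary)
const std::size_t kMaxLen = 512; // bytes kept per packet (arbitrary)

struct PacketSlot {
    std::size_t len;
    unsigned char bytes[kMaxLen];
};

// Static storage: the crash handler must not allocate memory.
static PacketSlot g_ring[kSlots];
static std::size_t g_next = 0;  // slot that will be overwritten next
static std::size_t g_count = 0; // number of valid slots, at most kSlots

// Call from the normal packet path. Oversized packets are truncated.
void recordPacket(const unsigned char* data, std::size_t len)
{
    assert(data != nullptr || len == 0);
    if (len > kMaxLen) len = kMaxLen;
    if (len > 0) std::memcpy(g_ring[g_next].bytes, data, len);
    g_ring[g_next].len = len;
    g_next = (g_next + 1) % kSlots;
    if (g_count < kSlots) ++g_count;
}

// Writes the stored packets to fd, oldest first, each as a raw size_t length
// followed by that many bytes. Returns the number of packets written.
std::size_t dumpRecentPackets(int fd)
{
    const std::size_t start = (g_next + kSlots - g_count) % kSlots;
    for (std::size_t i = 0; i < g_count; ++i) {
        const PacketSlot& s = g_ring[(start + i) % kSlots];
        ssize_t rc = write(fd, &s.len, sizeof(s.len));
        if (rc >= 0) rc = write(fd, s.bytes, s.len);
        (void)rc; // nothing useful can be done about a failed write here
    }
    return g_count;
}

extern "C" void onSegfault(int sig)
{
    dumpRecentPackets(STDERR_FILENO);
    raise(sig); // default action was restored by SA_RESETHAND, so this terminates
}

void installCrashHandler()
{
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = onSegfault;
    sa.sa_flags = SA_RESETHAND; // reset to the default action when the handler runs
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, nullptr);
}
```

Two caveats apply. If the crash comes from memory corruption, the ring buffer itself may have been overwritten, and a crash in the middle of `recordPacket` can leave the newest slot half-written. The dump is a hint for debugging, not a trustworthy record.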

Cleric
07-19-2002, 03:02 PM
Ratt and all you fine devs out there -

Is there any update on this, or do you need more information from us as to what or where the problem might be? Just curious.

Thanks

toddlar
07-23-2002, 12:37 PM
redhat 7.3

I seem to have somewhat random seg faults, as others have said. Hehe, I may just be ignorant. But it seems like it seg faults more when I have Konqueror (the KDE web browser) up and running. When it's not up, it still seg faults, just at random. Don't know if this helps.

One man's speculation,

Todd

troll
07-23-2002, 12:44 PM
Could be paranoid and just say it's Verant sending crap data down the pipe just to spite ShowEQ users. =)

As an update to my post, I am still getting segmentation faults in my configuration, just not as often. I don't get any in Howling Stones, but I do seem to get a lot in Kael, for what it's worth. And when I go down, usually so do others that I know are using ShowEQ in the same zone. Could be coincidence; I still don't have enough data to say for certain.

Just my two cents..

Ratt
07-23-2002, 02:45 PM
It's not some plot by Verant (there are much better ways to go about it) ... it's just something writing off the end of a structure or array somewhere and stomping on important bits.

The problem with tracking it down is we have no packet logs leading UP TO when this happens.

If you are willing to sacrifice your anonymity and send us a (huge) packet log of everything that happened from when you zoned up until the time of the seg fault, it would help... but then we'd know who you are, whatever you play on, etc... So it's kinda hard to track down :)
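
For readers unfamiliar with this class of bug, "writing off the end of a structure or array" typically means an unbounded copy into a fixed-size field, which overwrites whatever sits next to it in memory. The sketch below shows a bounded copy under assumed names. `SpawnRecord`, `kNameLen`, and `setName` are hypothetical and say nothing about where the real overrun was.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t kNameLen = 16; // hypothetical field width

struct SpawnRecord {
    char name[kNameLen];
    int hp; // stands in for the "important bits" that follow the array
};

// Copies at most kNameLen - 1 bytes into rec.name and always NUL-terminates.
// Returns true if the whole source fit without truncation.
bool setName(SpawnRecord& rec, const char* src, std::size_t srcLen)
{
    assert(src != nullptr);
    const std::size_t n = srcLen < kNameLen - 1 ? srcLen : kNameLen - 1;
    std::memcpy(rec.name, src, n);
    rec.name[n] = '\0';
    return n == srcLen;
}
```

An unbounded `strcpy` in the same spot would run straight into `hp` for any name of 16 characters or more, and the damage would only show up later, in unrelated code, which is consistent with how varied the backtraces above are.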

fee
07-25-2002, 01:03 AM
The patch I added yesterday SHOULD have solved most of these seg faults. Please let me know if you continue to experience them and, if possible, provide a stack trace.

Fee

BlueAdept
07-25-2002, 07:31 AM
Great going on finding that bug.

It is kind of funny hearing people complain about seg faults. Back in the HQ days, seq would seg usually at least once an hour and there was no --restore-all.

Now all those creative ways to tell a group that you have to camp for a minute (because seq segged) have virtually been eliminated.

SEQ has really come a long way in 3 years. Thanks for keeping it alive and improving it so much.

g0hst
07-26-2002, 05:53 AM
I'm still getting this (http://seq.sourceforge.net/showthread.php?s=&threadid=1537) one after the patch.

Nobody
07-27-2002, 02:41 PM
I just got this trying to compile the new patch, fee:

g++ -D_REENTRANT -O2 -Wall -g -ggdb -DDEBUG -finline-functions -DQT_THREAD_SUPPORT=1 -DDISPLAY_ICONS=false -DICON_DIR=\"/eq-icons/\" -o sortitem sortitem.o util.o -L/usr/lib/qt-gcc3/lib -lqt-mt -lpthread /usr/local/lib/libEQ.a -lgdbm -lz -lpcap -Wl,--rpath -Wl,/usr/lib/qt-gcc3/lib -Wl,--rpath -Wl,/usr/X11R6/lib
/usr/bin/ld: warning: libstdc++.so.4, needed by /usr/lib/qt-gcc3/lib/libqt-mt.so, may conflict with libstdc++.so.3
make[2]: Leaving directory `/home/Nobody/EQ/CVS/showeq/src'

This is using the qt debs from Azriel.

Pigeon
07-27-2002, 11:20 PM
Now all those creative ways to tell a group that you have to camp for a minute (because seq segged) have virtually been eliminated.

Aww crap! I have my AA's on 100% normal xp...

(five minutes later) WTF! It didn't change my AA settings...

docster
08-07-2002, 08:07 AM
I have not seen the error: <whatever> has been removed from the zone before we processed it. ANYMORE! Woot!

Been up for 4 days without restarting it now.

Thanks Fee!

fgay trader
08-07-2002, 11:32 AM
Originally posted by docster
Been up for 4 days without restarting it now.

4 days non-stop EQing? Get some sleep! Jeeeeeez! :p