Crash on 5.0.0.19 -- Stack Provided



Amadeus
04-17-2005, 11:55 PM
My ShowEQ kept crashing in proving grounds tonight. Here was a crash dump stack trace:



#0 0xffffe002 in ?? ()
(gdb) back
#0 0xffffe002 in ?? ()
#1 0x42028a73 in abort () from /lib/tls/libc.so.6
#2 0x40585687 in __cxxabiv1::__terminate(void (*)()) () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#3 0x405856d4 in std::terminate() () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#4 0x40585846 in __cxa_throw () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#5 0x40583872 in operator new(unsigned) () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#6 0x405838cf in operator new[](unsigned) () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#7 0x080845e7 in EQPacketFragmentSequence::addFragment(EQProtocolPacket&) (this=0x82dfdd8, packet=@0x84a5d30)
at packetfragment.cpp:101
#8 0x08085f6b in EQPacketStream::processPacket(EQProtocolPacket&, bool) (this=0x82dfd68, packet=@0x84a5d30)
at packetstream.cpp:744
#9 0x080853ee in EQPacketStream::processCache() (this=0x82dfd68) at packetformat.h:170
#10 0x080909d0 in EQPacket::dispatchPacket(EQUDPIPPacketFormat&) (this=0xb260, packet=@0xbfffce20) at packet.cpp:744
#11 0x080906f4 in EQPacket::processPackets() (this=0x8344258) at packet.cpp:631
#12 0x08092a7d in EQPacket::qt_invoke(int, QUObject*) (this=0x8344258, _id=2, _o=0xbfffef20) at packet.moc:522
#13 0x4026e0c9 in QObject::activate_signal(QConnectionList*, QUObject*) () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#14 0x4026df6d in QObject::activate_signal(int) () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#15 0x4054f68b in QTimer::timeout() () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#16 0x4028ef12 in QTimer::event(QEvent*) () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#17 0x4020ff24 in QApplication::internalNotify(QObject*, QEvent*) () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#18 0x4020fb19 in QApplication::notify(QObject*, QEvent*) () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#19 0x401ead95 in QEventLoop::activateTimers() () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#20 0x401c88e8 in QEventLoop::processEvents(unsigned) () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#21 0x40223cf6 in QEventLoop::enterLoop() () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#22 0x40223b98 in QEventLoop::exec() () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#23 0x40210151 in QApplication::exec() () from /usr/lib/qt-3.1/lib/libqt-mt.so.3
#24 0x08067e96 in main (argc=1, argv=0x8258fb0) at main.cpp:702
#25 0x42015574 in __libc_start_main () from /lib/tls/libc.so.6

Zaphod
04-18-2005, 12:17 AM
That looks a lot like you ran out of memory, the heap couldn't satisfy the amount of memory requested, or libstdc++/glibc thought so due to a heap corruption.

Enjoy,
Zaphod (dohpaZ)

purple
04-18-2005, 07:00 AM
From the stack trace, the crash happens when seq is newing the memory to hold an oversized packet. The size of the oversized packet is taken off the wire. So I'd imagine that you're running two EQ sessions without session tracking and things got crossed up. Packets from one session were getting put into the cache and used for the session seq was watching (note that in the stack trace, addFragment was being called from processCache, so the packet being processed was cached). It just so happened that seq had finished an oversized fragment before and treated the invalid cached packet as the initial fragment of an oversized payload. When it does this, it takes the size of the payload off the wire. If the packet it thinks is correct (because the arq seq number matches, even though it is for a different session) isn't actually the start of an oversized payload, the length is just garbage. It was probably some huge number that seq then went ahead and tried to new, which barfed on you.

Of course that is all conjecture, since you didn't really say anything other than the stack trace. The console output is helpful (you should be able to see crossed sessions by multiple SEQStart detected lines), as is any and all information you can give about what is up.

I could put a sanity check on the memory size taken off the wire to stop the crashing I guess, but really, once you're at that point something else is wrong.
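A minimal sketch of what such a sanity check could look like, for illustration only. The cap `kMaxOversizedPayload` and the helper `allocatePayload` are hypothetical names, not part of ShowEQ, and the cap value is an arbitrary assumption. The idea is to reject an implausible size before calling new, and to use the non-throwing form of new so a failed allocation returns null instead of ending in `std::terminate()`.

```cpp
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical upper bound on a believable oversized payload (assumption,
// not a ShowEQ constant). Anything larger is treated as garbage.
constexpr std::uint32_t kMaxOversizedPayload = 512 * 1024;

// Returns a buffer of wireSize bytes, or nullptr if the size read off the
// wire is zero, exceeds the cap, or the allocation itself fails.
std::unique_ptr<std::uint8_t[]> allocatePayload(std::uint32_t wireSize)
{
    if (wireSize == 0 || wireSize > kMaxOversizedPayload)
        return nullptr;

    // nothrow new yields nullptr on failure instead of throwing bad_alloc.
    return std::unique_ptr<std::uint8_t[]>(
        new (std::nothrow) std::uint8_t[wireSize]);
}
```

A caller would drop the fragment sequence when this returns null. That only stops the crash; the stream is still crossed up at that point.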

Amadeus
04-18-2005, 05:58 PM
I'm running ShowEQ in a debugger now, so hopefully I can get a better idea next time ;)

tanner
04-18-2005, 11:54 PM
Can I ask what version of gcc?

And if it's built from cvs or a package?

If a package, was it debian?

Got a bug in today on the debian package with a similar bt and I'm trying to collect as much info as I can.

Amadeus
04-19-2005, 01:58 PM
ok ....come to find out most of my 'crashes' aren't producing core files ;)

Anyway, here is what I'm seeing during some of my more recent crashes:



Info: Your player's id is 1619
Info: SpellItem 'Savage Roots' finished.
Debug: SEQ: Giving up on finding arq 0119 in stream zone-client cache, skipping!
Debug: SEQ: Giving up on finding arq 011a in stream zone-client cache, skipping!
Debug: SEQ: Giving up on finding arq 011b in stream zone-client cache, skipping!
Debug: SEQ: Giving up on finding arq 011c in stream zone-client cache, skipping!
Debug: SEQ: Giving up on finding arq 011d in stream zone-client cache, skipping!
Debug: SEQ: Giving up on finding arq 011e in stream zone-client cache, skipping!
Debug: SEQ: Giving up on finding arq 011f in stream zone-client cache, skipping!
Debug: SEQ: Giving up on finding arq 0120 in stream zone-client cache, skipping!
Debug: SEQ: Giving up on finding arq 0121 in stream zone-client cache, skipping!
Debug: SEQ: Giving up on finding arq 0122 in stream zone-client cache, skipping!
Debug: SEQ: Giving up on finding arq 0123 in stream zone-client cache, skipping!
Warning: !!!! EQPacketFragmentSequence::addFragment(): buffer overflow adding in new fragment to buffer with seq 018b on stream 3, opcode 53f6. Buffer is size 83619 and has been filled up to 83321, but tried to add 505 more!

Program exited with code 0377.

purple
04-19-2005, 02:31 PM
There are threads on this already.

Turn session tracking on if you do not have it on. If that doesn't help, increase arqSeqGiveUp to 512 or higher. Don't go higher than 1024. If that doesn't help, increase your socket receive buffer size.
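To illustrate what a give-up threshold like arqSeqGiveUp governs, here is a rough sketch of a reorder cache. It assumes the threshold is compared against the number of cached out-of-order packets. The class `ArqReorderCache` and its members are hypothetical and are not ShowEQ's actual code.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical reorder cache: holds packets that arrive ahead of the
// expected arq sequence number and gives up on a missing one once too
// many packets are waiting behind it.
class ArqReorderCache {
public:
    explicit ArqReorderCache(std::size_t giveUp) : giveUp_(giveUp) {}

    // Feed one arq sequence number; returns the sequence numbers that
    // become deliverable, in order.
    std::vector<std::uint16_t> push(std::uint16_t seq)
    {
        std::vector<std::uint16_t> ready;
        // Distance ahead of the expected number, with 16-bit wraparound.
        const std::uint16_t ahead = static_cast<std::uint16_t>(seq - expected_);
        if (ahead >= 0x8000)
            return ready;                 // behind us: old duplicate, ignore
        if (ahead == 0) {
            ready.push_back(seq);
            ++expected_;
        } else {
            cached_.insert(seq);          // future packet: hold it
        }
        drain(ready);
        // Too many packets waiting: treat the missing one as dropped.
        while (!cached_.empty() && cached_.size() >= giveUp_) {
            ++skipped_;                   // "Giving up on finding arq ..."
            ++expected_;
            drain(ready);
        }
        return ready;
    }

    std::size_t skipped() const { return skipped_; }

private:
    // Deliver cached packets for as long as the expected one is present.
    void drain(std::vector<std::uint16_t>& ready)
    {
        for (auto it = cached_.find(expected_); it != cached_.end();
             it = cached_.find(expected_)) {
            ready.push_back(*it);
            cached_.erase(it);
            ++expected_;
        }
    }

    std::size_t giveUp_;
    std::uint16_t expected_ = 0;
    std::size_t skipped_ = 0;
    std::unordered_set<std::uint16_t> cached_;
};
```

Under this model a larger threshold gives a late packet more time to show up before it is skipped, at the cost of holding more packets in the cache.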

QuerySEQ
04-19-2005, 05:31 PM
I had something similar happen. When one of the kids loaded up Call of Duty on another computer, for some reason I got flooded and it killed SEQ.

What I ended up having to do was put in a 10/100/1000 switch (a Zyxel 3024) that I had lying around collecting dust, and mirror the 2 ports that my EQ PC and SEQ were on.

Looks like either another EQ session or something else is filling up your arq.

Amadeus
04-19-2005, 08:45 PM
Strange though ...I used ShowEQ for years with this same setup and never had a problem before. Oh well.

Cryonic
04-21-2005, 12:01 AM
SOE has changed their network protocol in recent months, so code has changed in SEQ to accommodate this. Hence the differences in the way it works against multiple sessions and potentially against UDP traffic in general.

purple
04-21-2005, 06:56 AM
The main sticking point is that they moved from application-level compression to protocol-level compression. This means that instead of fewer, bigger packets there are more, smaller packets. This stresses Linux's pf_packet out during times of high throughput (i.e. zoning).

Turning on session tracking helps to cut down on the time packets spend in the receive buffer. Increasing arqSeqGiveUp is a natural consequence of needing more packets to zone (I've seen valid streams that have been 500 arq seq out of order before). Increasing the pf_packet's receive buffers makes it drop fewer packets when things get behind, at the cost of memory.
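As a generic illustration of the receive buffer knob (not how ShowEQ or libpcap sets it, which this thread does not show), a program can ask the kernel for a larger buffer with `SO_RCVBUF` and read back what it actually got. The helper name `requestReceiveBuffer` is hypothetical. On Linux the granted size is capped by the `net.core.rmem_max` sysctl, and the value reported back is double the one stored, to account for bookkeeping overhead.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Asks for a receive buffer of `bytes` on an open socket and returns the
// size the kernel reports afterwards, or -1 if either call fails.
int requestReceiveBuffer(int fd, int bytes)
{
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) != 0)
        return -1;

    int effective = 0;
    socklen_t len = sizeof(effective);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &effective, &len) != 0)
        return -1;

    return effective;   // may differ from the request (cap, doubling)
}
```

If the reported size stays well below the request, the system-wide maximum is probably the limit and has to be raised first.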

The new protocol only has sequencing to hold the stream together. There aren't as many protocol hints on the wire as there were previously. This means that dropped packets can destroy the stream. Right now, seq doesn't do much to try to minimize the effect of dropped packets because if you're getting dropped packets, it's bad and you should really fix the root of the problem, not try to work around it.

Dropped packets manifest themselves as "SEQ: Giving up on finding ..." messages. If the missing packets are parts of an oversized payload, fragment processing will more than likely screw up next, manifesting itself as a "buffer overflow adding in new fragment" message. Now you've clobbered memory and are just waiting for the segfault.

I could make it so that the fragmentation process just throws up its arms and makes you zone again to pick things back up. I could try to detect the next beginning of fragment by waiting for a non-fragmented protocol opcode, then picking the stream back up from there. But both of those still mean seq is borked. If the oversized payload we skip is the zone spawns, seq is gonna be pretty useless to you till you zone anyways.
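A rough sketch of the first option, abandoning the payload instead of writing past the buffer. `FragmentBuffer` and its methods are hypothetical and are not the real EQPacketFragmentSequence. It assumes the caller knows the total payload size when the first fragment arrives.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reassembly buffer that refuses to overflow.
class FragmentBuffer {
public:
    // Start collecting a new oversized payload of `total` bytes.
    void begin(std::size_t total)
    {
        data_.assign(total, 0);
        filled_ = 0;
        active_ = true;
    }

    // Append one fragment. If it would not fit, the whole payload is
    // abandoned and false is returned; memory is never clobbered.
    bool add(const std::uint8_t* frag, std::size_t len)
    {
        if (!active_)
            return false;
        if (len > data_.size() - filled_) {
            reset();                      // throw up our arms
            return false;
        }
        std::copy(frag, frag + len, data_.begin() + filled_);
        filled_ += len;
        return true;
    }

    bool complete() const { return active_ && filled_ == data_.size(); }
    bool active() const { return active_; }

    // Drop any partial payload and wait for the next begin().
    void reset()
    {
        data_.clear();
        filled_ = 0;
        active_ = false;
    }

private:
    std::vector<std::uint8_t> data_;
    std::size_t filled_ = 0;
    bool active_ = false;
};
```

With the numbers from the warning earlier in the thread (buffer of 83619, filled to 83321, 505 more offered), `add` would return false and the partial payload would be discarded rather than overrun. The stream is still broken until the next zone, as noted above.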

As it stands, I've just been doing other things, since turning on session tracking, upping arqSeqGiveUp and if necessary upping your receive buffers usually fixes the problem.