Seg faults



Cryonic
08-17-2002, 07:26 PM
Had a little trouble with the system that I use SEQ on. Two of the 3 drives in my RAID5 container were marked dead. I forced them back online, forced a consistency check of the container, and then during boot had to manually run fsck on the partitions held there. Since then (even after rebuilding SEQ from a fresh checkout) it seg faults while zoning. I have gone through and verified the rpms on the system (QT was built from a src.rpm using gcc3).

System:
Redhat 7.2
Quad PPro 200
512MB RAM
18GB RAID5 container (3 9GB SCSI drives).

Even though I have ulimit set to unlimited, no core files are generated by the seg fault.
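
For reference, the core-file limit is a per-process setting inherited from the launching shell, so it has to be checked in the same shell that starts the program (a fresh login shell from su - starts with its own limits). The sketch below assumes bash; core_limit_report is a made-up helper name, not part of SEQ. A limit of unlimited still does not guarantee a core file: Linux also skips the dump for set-user-ID programs and when the current directory is not writable, which may or may not be what is happening here.

```shell
# Hypothetical helper: print the soft and hard core-file size limits
# of the current shell.
core_limit_report() {
    printf 'soft=%s hard=%s\n' "$(ulimit -Sc)" "$(ulimit -Hc)"
}

core_limit_report

# Raise the soft limit as far as the hard limit allows, for this shell
# and the programs started from it only.
ulimit -Sc "$(ulimit -Hc)"
```

Run this in the shell that launches the program, then start the program from that same shell.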

Trying to figure out what else I can do to try and fix this. Verified the libEQ.a md5sum and tried a fresh co from CVS. Still no luck. Even tried downgrading a version (in case this is a SEQ problem and not a system problem).
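
The checksum check mentioned above can be sketched roughly like this. check_md5 is a hypothetical helper name, and the expected digest has to come from a trusted source; none is given in this thread, so it is left as a parameter.

```shell
# check_md5 FILE EXPECTED
# Succeeds only if the MD5 digest of FILE equals EXPECTED.
# (Hypothetical helper; relies on md5sum from GNU coreutils.)
check_md5() {
    actual=$(md5sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

# Usage (EXPECTED_SUM is a placeholder for the published digest):
#   check_md5 libEQ.a "$EXPECTED_SUM" && echo "libEQ.a matches"
```

A matching digest only shows that the file on disk is intact at the moment of the check; it says nothing about files that were already installed elsewhere.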

Dedpoet
08-18-2002, 04:20 PM
I know you're a regular, Cryonic, so excuse me if this seems obvious, but did you do a "make -f Makefile.dist"? Did you try removing the entire seq directory and doing the checkout, or did you just checkout over the top?

Is your other data on those drives accessible? What if you checkout seq to a directory on another filesystem that is not on your array?

fryfrog
08-18-2002, 06:18 PM
raid controllers usually don't mark drives as bad if there isn't SOMETHING wrong with them. have you tried replacing the drives (if that is an option) or re-creating the container and re-installing? 2 drives failing in a raid 5 of 3 drives would kinda suck :(

did the fsck find a butt load of errors?

if you run fsck after running it once, does it continue to find errors?

Cryonic
08-18-2002, 08:09 PM
I haven't replaced the drives; as far as I can tell, the controller marked them dead most likely because the system overheated. I can't afford to replace the two 9GB 10K drives. Since fsck only found about 10 errors and none of the other programs on those partitions are misbehaving, I believe the issue is somewhere within or around the SEQ files.

My standard method of rebuilding:

cvs -z3 update (or delete directory and cvs co)
make -f Makefile.dist && ./configure && make -j4

wait about 5min then su - and make install.

The co from cvs goes into the home partition which is down on a RAID5 container made up of 3 4GB drives. The system belongs to my Dad's company and he wasn't using it, so he brought it home for me to beat on.

I guess I could wipe and reinstall, but don't want to do that as the rest of the system appears fine (verifying the rpms with those on cds only found that config files had changed, no binaries). Data on the partitions probably survived because of a combo of RAID5 and ext3 (Xor and journalling of all data in the container).

P.S. Just noticed that SEQ hadn't been able to cp showeqitemdbtool to /usr/local/bin before, but this time it did. Judging by when it segfaulted, I'm guessing that the SEQ binary uses this tool while it is running (it seg faults right after receiving character data and when I looted stuff). Guess this could've been something to note earlier.

Nope, still segfaults. Here's what happens:

Zone: EntryCode: Client
EQPacket::dispatchZoneData():CharProfileCode:Not Decoded
Line 109 in map '/usr/local/share/showeq/Cabeast.map' has X and Y coordinates with no Z!
M Line 109 in map '/usr/local/share/showeq/Cabeast.map' has fewer points than specified!
Loaded map: '/usr/local/share/showeq/Cabeast.map'
Zone: Zoning, Please Wait... (Zone: 'cabeast')
No Zone Specific filter file '/usr/local/share/showeq/filters_cabeast.conf'.
Loading default '/usr/local/share/showeq/filters.conf'.
Zone: EntryCode: Server, Zone: cabeast
TIME: 17:07 12/01/3177
EQ EPOCH OCCURRED AT 771472900 SECONDS POST UNIX EPOCH
CPlayerItems: count=105 size=37594 packetsize=358 expsize=358
Segmentation fault

fee
08-19-2002, 05:37 AM
Just for shitz, try doing make -j2 instead of j4.

fee

Cryonic
08-25-2002, 09:23 PM
OK, wasn't the make option (I have 4 cpus so I use them all when building stuff :)). Manually wiped out the copies in /usr/local/bin and again wiped out the /usr/local/share/showeq directory (in case of a corrupted conf file).

Then just ran make install in the cvs download directory. Now it works and I'm configuring it again.
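
A more cautious variant of the same cleanup is to move the old install aside instead of deleting it, so the stale files can still be compared later. This is only a sketch: stash_old_install and the backup directory name are made up, and it assumes the installed binaries all start with showeq (showeqitemdbtool does; the main binary name is an assumption).

```shell
# stash_old_install PREFIX
# Moves PREFIX/bin/showeq* and PREFIX/share/showeq into a backup
# directory under PREFIX and prints the backup path.
# (Hypothetical helper, not part of the SEQ build.)
stash_old_install() {
    prefix=$1
    backup="$prefix/showeq-backup.$$"
    mkdir -p "$backup/bin" "$backup/share" || return 1
    for f in "$prefix"/bin/showeq*; do
        # An unmatched glob stays literal, so skip names that do not exist.
        [ -e "$f" ] && mv "$f" "$backup/bin/"
    done
    if [ -d "$prefix/share/showeq" ]; then
        mv "$prefix/share/showeq" "$backup/share/"
    fi
    echo "$backup"
}

# Usage, as root, from the cvs checkout directory:
#   stash_old_install /usr/local && make install
```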

P.S. It took me this long to get back to this problem because my game system was tied up with other tasks on a different connection that SEQ can't monitor (and I don't feel like trying to use UDPecho).