Solaris 10 Kernel Corruption

Last week we wanted to see how much current our Sun E4500 was drawing. It has eight 400 MHz SPARC CPUs and 8 GB of RAM. I had to shut down the server and put a measuring device in line with its power feed. It turned out the server was drawing about 3.5 amperes when idle.

After removing the device and restarting the server, it no longer came up and gave the following error:

Boot device: sol10  File and args:
not found: rtt_ctx_end
not found: rtt_ctx_end
not found: rtt_ctx_end
not found: rtt_ctx_end
not found: rtt_ctx_start
not found: rtt_ctx_start
not found: rtt_ctx_start
not found: rtt_ctx_start
do_relocations: /platform/sun4u/kernel/cpu/sparcv9/SUNW,UltraSPARC-II do_relocate failed
krtld: error during initial load/link phase
panic - boot: exitto64 returned from client program
Program terminated

So it seems a kernel file (/platform/sun4u/kernel/cpu/sparcv9/SUNW,UltraSPARC-II) had become corrupted. I booted from CD and checked the file. It did exist. I then compared its checksum with that of the same file on another SPARC server: the checksums differed, although the file sizes were the same. I copied the file over from the other server and replaced the suspect one, but the system still didn't boot.
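
For illustration, the size-and-checksum comparison described above could be sketched as follows. This is not the command that was actually run on the server; the function names, the choice of SHA-256, and the example paths are all assumptions made for the sketch.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_files(suspect, reference):
    """Report whether two files match in size and in content digest.

    Both paths are hypothetical: 'suspect' is the file on the broken
    system, 'reference' is a copy taken from a known-good machine.
    """
    return {
        "same_size": os.path.getsize(suspect) == os.path.getsize(reference),
        "same_digest": file_digest(suspect) == file_digest(reference),
    }
```

A result with `same_size` true but `same_digest` false matches what happened here: the file was the right length, yet its contents differed from the reference copy.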

In the end I ran an install from CD-ROM and chose an upgrade installation from the install menu. The upgrade took about 5 hours but fixed everything. All my configuration was kept unchanged (Sun Ray Server, metadbs, etc.). After the upgrade everyone was able to work on this server again.
