Do I get points for not killing him?

Last week (during the BTOP death-march), one of my coworkers -- the GIS manager, on whose good work much of our application depended -- came to me with the complaint that when he and his staff person opened a project in ArcMAP for the first time each day it was slow as molasses, but that subsequent opens were as quick as usual. The problem started three weeks ago, says he, and he intimated that it might be a "local network problem" because none of their users elsewhere were complaining... *

So, over the course of a week we spent a bunch of time testing and troubleshooting. The good news is that the problem was generally reproducible, but our testing results were just ... odd. And, they sort of supported the idea of a network-level problem. However, my gut was telling me that it was a client or server issue.

The more I dug into the network, the stronger my hunch got. Thanks to many dinner-table conversations with taichigeek, I have some knowledge of how the inner workings of a database server, well, work. Yesterday afternoon I had my colleague and his minion perform the following test: go into ArcMAP and wait through the very long project open, then reboot and immediately dive right back into the same ArcMAP project. The result? Unlike previous tests, where the interval between rebooting and starting ArcMAP was a lot longer,** this time they were able to jump into ArcMAP and load their projects at "normal" speed. Ah ha!

That result told me that the problem was not client side. The arrows were now pointing firmly at their ArcGIS database server.

This box is a hefty machine -- twin dual-core 3GHz Xeon (Dempsey) processors with 8GB RAM and 1.4 TB of usable disk -- running a 400GB+ database packed with geo-coded data and high-resolution 'pictometry' (very sharp aerial photos). I spent some time this morning staring at perfmon and rummaging through the event logs. I should have started here a week ago.

The event log was full of entries about SQL Server timeouts trying to get buffers and latches, and very long instances (e.g. 680,000 ms, or about 11 minutes) of trying to grow the transaction log for the GIS database. The machine needed the services of a competent DBA.
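
As a quick sanity check on that 680,000 ms figure, here is a minimal Python sketch of the conversion. The function name is made up for illustration; it is not part of SQL Server or perfmon.

```python
def ms_to_minutes(ms: float) -> float:
    """Convert a duration in milliseconds to minutes."""
    return ms / 1000 / 60


# The long transaction-log growth wait reported in the event log.
print(f"{ms_to_minutes(680_000):.1f} minutes")
```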

After lunch I went to pay my coworker a visit. His minion greeted me with "[He] has something to fess up to." Oh? He told me that mid-morning he had decided to take a look at the SQL database, and discovered that it had been a wee bit longer than he had thought since the last time he purged the database transaction log. How big was the log? Three hundred eighty gigabytes. He rather circumspectly admitted that their applications had been running fine ever since he pruned the log.

He also told me that he had called ESRI support and found out how to configure the transaction log so that it would never grow beyond 2GB (old entries would just be pushed out). And that is probably why I didn't kill him.




* In the final analysis this probably has a lot more to do with our two GIS people starting their days at 7:30 AM and most other city workers starting at 8:30 or 9:00.
** Due to things like shifting subnets, moving connections from one switch to another, etc.

Comments

thecoughlin
Aug. 23rd, 2009 02:14 am (UTC)
probably best.

If stupidity were a capital offense, we'd run out of room to bury them all.
netcurmudgeon
Aug. 25th, 2009 02:35 am (UTC)
Cremation. We can pave the road to Hell with their ashes!