netcurmudgeon (netcurmudgeon) wrote,

Do I get points for not killing him?

Last week (during the BTOP death-march), one of my coworkers -- the GIS manager, on whose good work much of our application depended -- came to me with the complaint that when he and his staff person opened a project in ArcMAP for the first time each day it was slow as molasses, but that subsequent opens were quick as usual. The problem had started three weeks earlier, says he, and he intimated that it might be a "local network problem" because none of their users elsewhere were complaining... *

So, over the course of a week we spent a bunch of time testing and troubleshooting. The good news is that the problem was generally reproducible, but our testing results were just ... odd. And, they sort of supported the idea of a network-level problem. However, my gut was telling me that it was a client or server issue.

The more I dug into the network, the stronger my hunch got. Thanks to many dinner-table conversations with taichigeek, I have some knowledge of how the inner workings of a database server, well, work. Yesterday afternoon I had my colleague and his minion perform the following test: go into ArcMAP and wait through the very long project open, then reboot and immediately dive right back into the same ArcMAP project. The result? Unlike previous tests, where the interval between rebooting and starting ArcMAP was a lot longer,** this time they were able to jump into ArcMAP and load their projects at "normal" speed. Ah ha!

That result told me that the problem was not client side. The arrows were now pointing firmly at their ArcGIS database server.

This box is a hefty machine -- twin dual-core 3GHz Xeon (Dempsey) processors with 8GB RAM and 1.4 TB of usable disk -- running a 400GB+ database packed with geo-coded data and high-resolution 'pictometry' (very sharp aerial photos). I spent some time this morning staring at perfmon and rummaging through the event logs. I should have started here a week ago.

The event log was full of entries about SQL Server timeouts trying to get buffers and latches, and very long instances (e.g. 680,000 ms, or about 11 minutes) of trying to grow the transaction log for the GIS database. The machine needed the services of a competent DBA.

After lunch I went to pay my coworker a visit. His minion greeted me with "[He] has something to fess up to." Oh? He told me that mid-morning he had decided to take a look at the SQL database, and discovered that it had been a wee bit longer than he had thought since the last time he purged the database transaction log. How big was the log? Three hundred eighty gigabytes. He rather circumspectly admitted that their applications had been running fine ever since he pruned the log.

He also told me that he had called ESRI support and found out how to configure the transaction log so that it would never grow beyond 2GB (old entries would just be pushed out). And that is probably why I didn't kill him.
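
For the curious, here is a rough sketch of what a fix along those lines might look like on SQL Server. It is not necessarily what ESRI support recommended, which I did not see. The database name `gis` and the logical log file name `gis_log` are made up for illustration, and the Python below only builds the T-SQL statements so they can be reviewed before anyone runs them. It assumes the SIMPLE recovery model is acceptable, which gives up point-in-time restores in exchange for log space being reused after checkpoints.

```python
def log_cap_statements(database, log_file, max_gb=2):
    """Build T-SQL statements that would cap a transaction log.

    Nothing is executed here; the strings are meant to be reviewed and
    then run by hand. `database` and `log_file` are hypothetical names.
    """
    return [
        # Report current log size and percent used for every database.
        "DBCC SQLPERF(LOGSPACE);",
        # SIMPLE recovery lets SQL Server reuse log space after checkpoints.
        f"ALTER DATABASE [{database}] SET RECOVERY SIMPLE;",
        # Keep the log file from growing past the cap.
        f"ALTER DATABASE [{database}] MODIFY FILE "
        f"(NAME = {log_file}, MAXSIZE = {max_gb}GB);",
    ]


if __name__ == "__main__":
    for statement in log_cap_statements("gis", "gis_log"):
        print(statement)
```

A hard cap is a blunt instrument: if a single transaction needs more log space than the cap allows, it will fail rather than grow the file, so the 2GB figure is something to check against the actual workload.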

* In the final analysis this probably has a lot more to do with our two GIS people starting their days at 7:30 AM and most other city workers starting at 8:30 or 9:00.
** Due to things like shifting subnets, moving connections from one switch to another, etc.
Tags: geeking, gis, no kill i, sql, work
