MacOS X, with Redundant Slow File Databases!

February 11, 2009

I have a couple of PowerBook G4 laptops that are now running Leopard. I keep them closed, in sleep mode, most of the time, often for days at a time, as I’m doing most of my work on faster Macs now. When I do open them up to do something, I often find that they are slowed to a crawl by a “find” process madly searching the disk and using most of the CPU power for the next hour or so. Just when I want to use the computer, it’s too busy to be usable.

What’s happening? I discovered that Leopard updates the “locate” database in its weekly cron script. For a computer that’s on most of the time, that generally happens when it’s idle and I’m not around to care if the computer is slow. If it misses that time because the computer wasn’t on, it runs the job as soon as it wakes up. Right when I want to use it.

So, in addition to Spotlight hogging up the computer, Leopard builds a redundant, Unix-style file database, too. Yes, I was involved in writing that stuff for GNU/Linux, but on Macs I almost never want to run “locate”. You’d think Apple would rewrite it as a Spotlight front-end.

On Mac laptops (excuse me, notebooks), I now edit /etc/defaults/periodic.conf and set

weekly_locate_enable="NO"
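The defaults file on FreeBSD-derived systems says not to edit it and to put overrides in /etc/periodic.conf instead. Assuming Leopard's periodic follows that convention, the same setting can go in the override file, where an OS update is less likely to clobber it. The following is only a rough sketch: the function name set_periodic_knob is made up for illustration, and it simply drops any earlier assignment of the variable and appends the new one.

```shell
#!/bin/sh
# Sketch: set a periodic(8) variable in an override file, replacing any
# earlier assignment of the same variable. The function name is hypothetical.
set_periodic_knob() {
    conf_file=$1   # e.g. /etc/periodic.conf
    name=$2        # e.g. weekly_locate_enable
    value=$3       # e.g. NO
    tmp_file=$(mktemp) || return 1
    # Keep every existing line except a previous assignment of this variable.
    if [ -f "$conf_file" ]; then
        grep -v "^${name}=" "$conf_file" > "$tmp_file" || true
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$tmp_file"
    # Write through the existing file so its ownership and mode are kept.
    cat "$tmp_file" > "$conf_file" && rm -f "$tmp_file"
}

# Usage (as root):
#   set_periodic_knob /etc/periodic.conf weekly_locate_enable NO
```

Running it twice is harmless, since the second run replaces the line written by the first.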

On the slower ones, I also turn Spotlight off completely (I think), by running (with sudo) the commands I found in this tip:

launchctl unload /System/Library/LaunchDaemons/com.apple.metadata.mds.plist
launchctl unload -w /System/Library/LaunchDaemons/com.apple.metadata.mds.plist

I use EasyFind if I really need to find a file. It produces more usable results than Spotlight does, anyway.

Now my laptops have enough spare CPU time for me to use them again. Thanks, Leopard.

Macs Needing Unix Network Geekery

February 9, 2009

Several years ago, I noticed that SMB file sharing between Macs (running 10.3 Panther at that time, I think) and Windows XP was a lot slower than it should have been. Copying a file took several times as long as between two PCs on the same 100 megabit LAN. Some research turned up the fact that the MacOS X default network parameters are suboptimal, at least when talking to Windows XP. The fix is to create the file /etc/sysctl.conf (in Terminal, with sudo) and put some tweaked settings in it.

The same problem exists in Leopard. The sysctl settings to fix it are slightly different for Leopard and Gigabit networks. Here are some explanations. Here is my sysctl.conf for Leopard and Snow Leopard; omit the maxsockbuf line in Lion and later, and you need only the first two lines in Mavericks (I think) and later because Apple changed the defaults to these settings:

net.inet.tcp.delayed_ack=0
net.inet.tcp.mssdflt=1440
kern.ipc.maxsockbuf=500000
net.inet.tcp.sendspace=250000
net.inet.tcp.recvspace=250000
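To try the settings without waiting for the next boot (assuming the file is only read at startup), each line can be handed to sysctl -w. This is a minimal sketch, not a polished tool: the function name sysctl_commands is made up, it expects plain name=value lines with no leading whitespace, and it only prints the commands so they can be inspected before anything is changed.

```shell
#!/bin/sh
# Sketch: turn each "name=value" line of a sysctl.conf-style file into a
# "sysctl -w" command. It only prints the commands; nothing is applied
# until the output is piped to a root shell. The function name is hypothetical.
sysctl_commands() {
    conf_file=$1
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            ''|'#'*) continue ;;                       # skip blanks and comments
            *=*) printf 'sysctl -w %s\n' "$line" ;;    # one command per setting
        esac
    done < "$conf_file"
}

# Usage:
#   sysctl_commands /etc/sysctl.conf            # review the commands
#   sysctl_commands /etc/sysctl.conf | sudo sh  # apply them
```

Printing first and piping second keeps the step that needs root separate from the parsing, which makes a typo in the file easier to spot.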

I also got errors when copying files from an OS X share while working on a Windows XP client. Windows says it can’t read the source file. Going over to the Mac and copying the same files onto a shared folder on the PC works. Some Googling revealed that there’s a bug in the version of Samba that ships with Leopard. It doesn’t properly support extended attributes (an alternate data stream). I don’t need those anyway, so the fix is to turn off the buggy feature until it gets fixed in a future release. Here’s the diff:

--- /etc/smb.conf	2009/01/04 22:39:52	1.1
+++ /etc/smb.conf	2009/02/08 14:20:50
@@ -44,7 +44,7 @@
     display charset = UTF-8-MAC
     dos charset = 437

-    vfs objects = darwinacl,darwin_streams
+    vfs objects = darwinacl

     ; Don't become a master browser unless absolutely necessary.
     os level = 2
@@ -56,8 +56,8 @@
     use sendfile = yes

     ; The darwin_streams module gives us named streams support.
-    stream support = yes
-    ea support = yes
+    stream support = no
+    ea support = no

     ; Enable locking coherency with AFP.
     darwin_streams:brlm = yes

In Snow Leopard (10.6.6), the changes needed are as follows:

--- /etc/smb.conf	2010/01/22 00:04:17	1.4
+++ /etc/smb.conf	2010/04/20 13:14:28
@@ -44,7 +44,7 @@
     display charset = UTF-8
     dos charset = 437
 
-    vfs objects = notify_kqueue,darwinacl,darwin_streams
+    vfs objects = notify_kqueue,darwinacl
 
     ; Don't become a master browser unless absolutely necessary.
     os level = 2
@@ -58,10 +58,12 @@
     mangled names = no
     stat cache = no
     wide links = no
+    ; Preserve performance.
+    getwd cache = yes
 
     ; The darwin_streams module gives us named streams support.
-    stream support = yes
-    ea support = yes
+    stream support = no
+    ea support = no
 
     ; Enable locking coherency with AFP.
     darwin_streams:brlm = yes

Restarting the Mac is the easiest way to make these changes take effect.

Lion (10.7) and later use smbd instead of Samba and don’t have this configuration file.

The Mac lover in me is annoyed that Apple ships poor defaults for this important function. How much do they care about Windows file sharing? The Unix geek in me is glad that the free software underpinnings of OS X are configurable enough that I can fix them by editing a couple of text files!

And if you experience a delay of several seconds when connecting to a Windows file share from a Mac, e.g. using “Go->Connect to Server”, make sure to use the full name of the Windows server. On our Active Directory network at the office, when I connected using the form “smb://servername/sharename”, there was about a 6-second delay before the share mounted. When I switched to the form “smb://servername.dom.ain/sharename”, it went down to under a second to connect.

Fast Searching Is Slow

September 15, 2008

Whenever I get a new Apple or Microsoft OS release, I spend a couple of days finding out how to turn off most of the new features, because each release has a usable RAM requirement about twice as big as the previous one.

A couple of weeks ago I upgraded some Macs in our studio from Tiger to Leopard. We have around 40 Firewire hard drives with audio and video files on them, and Leopard wants to re-index them all for Spotlight. I decided to let it go ahead and plugged one drive into each of several Macs. After 3-4 hours of using 100% CPU and showing progress bars that alternated between estimates like 30 hours or 60 hours remaining, and a barber pole progress bar saying it was estimating the time it would take (again), I’d had enough. I never need Spotlight on those drives anyway. What was it trying to do, full-text index my audio and video files?

I found out that I can run the command “touch .metadata_never_index” on the root of each drive to stop this nonsense. Doing that on all 40 drives took maybe 20 minutes, and now I can work again.
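With that many drives, a small loop saves some typing. This is a rough sketch that assumes external drives are mounted under /Volumes and that the startup disk shows up there as a symbolic link, which the loop skips; the function name mark_never_index is made up for illustration.

```shell
#!/bin/sh
# Sketch: create the .metadata_never_index marker at the root of every
# directory found under a mount point directory (/Volumes by default).
# The function name is hypothetical.
mark_never_index() {
    volumes_dir=${1:-/Volumes}
    for volume in "$volumes_dir"/*/; do
        # An unmatched glob stays literal, so make sure the directory exists.
        [ -d "$volume" ] || continue
        # Skip symbolic links (assumed to be how the startup disk appears).
        [ -L "${volume%/}" ] && continue
        touch "${volume}.metadata_never_index" && echo "marked $volume"
    done
}

# Usage:
#   mark_never_index            # every volume under /Volumes
#   mark_never_index /Volumes   # same thing, spelled out
```

Drives that are plugged in later still need the marker, so the function has to be run again after mounting them.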

This reminds me of the early ’90s when I was maintaining GNU find. The “fast find” code from BSD find was in the public domain, and had been factored out by POSIX.2 into a separate locate command, so I added that to the GNU findutils distribution. In doing so, I refactored James Woods’ monolithic code into coherent functions with meaningful variable names. That allowed me to figure out what the code was doing and document the database file format (and change it to be 8 bit clean).

Richard Stallman (leader of the GNU Project) thought the separate locate program was an inelegant kludge, because it wasn’t guaranteed to produce correct (up-to-date) results: its output depended on what had changed since the updatedb command was last run to walk the file system and update the locate database. So he added an item to the GNU task list to integrate the locate code properly into find, making the database a transparent optimization. Find would fall back on brute-force file system traversal for any directory tree that lacked an up-to-date locate database (judged by time stamps).

After some months, a volunteer named Tim actually submitted a modified version of GNU find where he had done that. Unfortunately, in the meantime I had made some other major changes to GNU find, so Tim’s patches no longer applied. Also, I wasn’t sure I could prove his code to be correct. I kept thinking of edge cases and issues with network mounted file systems that complicated the problem. I integrated some of Tim’s optimizations, but the locate database integration into find was never finished. I always felt a little sad about that.

It was gratifying years later to see Apple, Microsoft, and Google bring indexed disk searching to the public finally, hopefully mostly correctly. (And now including search on file contents as well as attributes.) I just wish the indexing wasn’t such a pig.

