Just a heads up — this blog post has nothing to do with the coordinator pattern.

Those who have been following my blog for a long time might remember an old post, Cache Me If You Can. The idea behind the post was simple: if you just want caching, eschew Core Data in favor of something simple: a file on disk that represents a single NSCoding object. Multiple types of things you want to store? Multiple files on disk.

That concept found its way into the Backchannel SDK, as BAKCache, and has sort of followed me around ever since. It found its way into Swift, learned what a generic was, got renamed ObjectStorage, figured out how to handle user data as well as caches, and now supports Codable instead of NSCoding. You can find that version in this gist.
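
For orientation, here’s roughly the shape the class ended up with. This is a simplified sketch, not the gist verbatim; the initializer and the choice of the caches directory are my assumptions:

import Foundation

final class ObjectStorage<T: Codable> {

    private let storageURL: URL

    // Assumption: one file per stored type, living in the caches directory.
    init(filename: String) {
        let caches = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask)[0]
        storageURL = caches.appendingPathComponent(filename)
    }

    func fetchObject() -> T? {
        guard let data = try? Data(contentsOf: storageURL) else { return nil }
        return try? JSONDecoder().decode(T.self, from: data)
    }

    func save(object: T) {
        guard let data = try? JSONEncoder().encode(object) else { return }
        try? data.write(to: storageURL, options: .atomic)
    }
}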

Recently, I found myself reading the object, mutating it, and then writing it back. After I’d written that code a few times, I noticed a pattern and decided to formalize it in a method on the class:

func mutatingObject(defaultValue: T, _ block: @escaping (inout T) -> Void) {
    var object = self.fetchObject() ?? defaultValue
    block(&object)
    self.save(object: object)
}

However, there’s a problem revealed by this method. The class’s design encourages you to create lots of instances of ObjectStorage wherever you need them, and thus allows you to read and write to the same file from lots of different places. What happens when two instances read at the same time, both mutate the object in question, and try to write it back to the file? Even if the read and write were individually perfectly atomic, the second instance to write would overwrite the mutation from the first!
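
Concretely, the lost update looks something like this (a contrived sketch; assume IntHolder is a tiny Codable struct with a single value property):

// Two storage instances pointing at the same file on disk.
let a = ObjectStorage<IntHolder>(filename: "counter")
let b = ObjectStorage<IntHolder>(filename: "counter")

// Both read the same starting value (say, 0)...
var fromA = a.fetchObject() ?? IntHolder(value: 0)
var fromB = b.fetchObject() ?? IntHolder(value: 0)

// ...both increment their local copies...
fromA.value += 1
fromB.value += 1

// ...and the second save clobbers the first. The file ends up at 1, not 2:
// one write has been silently lost.
a.save(object: fromA)
b.save(object: fromB)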

Access to the file needs to be gated somehow, and if one object is in the process of mutating, no other object should be allowed to read until that mutation is complete.

Foundation provides a handy API for this, called NSFileCoordinator, which, along with a protocol called NSFilePresenter, ensures that any time a file is updated on the disk, the NSFilePresenter object can be notified about it, and update itself accordingly. This coordination works across processes, which is impressive, and is what enables features like Xcode prompting you to reopen the Xcode project when there’s a change from a git reset or similar.
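
For context, adopting NSFilePresenter looks roughly like this. This is a minimal sketch of the protocol’s moving parts, and, as we’ll see, we won’t actually end up needing it:

import Foundation

// A minimal presenter: the system calls presentedItemDidChange() whenever
// another writer (possibly in another process) changes the file.
final class StoragePresenter: NSObject, NSFilePresenter {

    let presentedItemURL: URL?
    let presentedItemOperationQueue = OperationQueue()

    init(url: URL) {
        presentedItemURL = url
        super.init()
        NSFileCoordinator.addFilePresenter(self)
    }

    func presentedItemDidChange() {
        // The file changed out from under us; re-read it here.
    }

    deinit {
        NSFileCoordinator.removeFilePresenter(self)
    }
}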

NSFileCoordinator is indispensable in a few cases:

  1. Since Mac apps deal with files being edited by multiple apps at once, almost all document-based Mac apps use it. It provides the underpinning for NSDocument.
  2. Extensions on iOS and the Mac run in a different process from their host app and may be mutating the same files at the same time.
  3. Syncing with cloud services requires the ability to handle updates to files while they are open, as well as special dispensation for reading files for the purpose of uploading them.

For our use case, we don’t really fall into any of the buckets where NSFileCoordinator is crucial — our files are all in our sandbox so no one else is allowed to touch them, and we don’t have an extension. However, NSFileCoordinator can help with file access coordination within the same process as well, so it can potentially solve our problem.

The documentation for NSFileCoordinator and NSFilePresenter is sadly not great. Here’s the gist: NSFilePresenter is the protocol through which the system informs you about what changes are about to happen to your file and how to react to them. NSFileCoordinator, on the other hand, does two things: first, it “coordinates” access to files at URLs, ensuring that two chunks of code don’t modify the data at some URL at the same time; second, it informs any active presenters (across processes!) about upcoming changes to files. We don’t care much about the latter, but the former is relevant to our work here.

Let’s add coordination to the save(object:) method.

func save(object: T) {

You need to keep track of a mutable error, which reports whether another file presenter caused an error while preparing for the execution of the block.

    var error: NSError?

Next, you create a file coordinator specific to the operation. You can also pass the NSFilePresenter responsible for this change to the NSFileCoordinator. As far as I can tell, this is only used to exclude the current file presenter from getting updates about the file. Since we don’t need any of the file presenter callbacks, we can just pass nil for this parameter and not worry about it.

    let coordinator = NSFileCoordinator(filePresenter: nil)

This next line includes a lot of detail. Let’s take a look first and then break it down.

    coordinator.coordinate(writingItemAt: storageURL, options: .forReplacing, error: &error, byAccessor: { url in

The method is passed the URL we care about, as well as our mutable error.

The block hands you back a URL to use. You’re expected to use that URL rather than the one you passed in, since they might not be the same.

The reading and writing options are very important. They are the biggest determinant of which URL you’ll get in your block. If you pass something like .forUploading as a reading option, the coordinator will copy the file to a new place (no doubt using APFS’s copy-on-write feature) and point you to that new location, so that you can perform your lengthy upload without blocking any other access to the file.
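
As an aside, here’s roughly what such a coordinated read for uploading would look like. This is a sketch, and upload is a hypothetical function standing in for your networking code:

    coordinator.coordinate(readingItemAt: storageURL, options: .forUploading, error: &error, byAccessor: { url in
        // `url` may point at a temporary snapshot copy, so a slow upload
        // here won't block other readers and writers of the original file.
        if let data = try? Data(contentsOf: url) {
            upload(data) // hypothetical networking call
        }
    })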

With that heavy line of code past us, the actual content of the block stays largely the same as before:

        do {
            let data = try JSONEncoder().encode(object)
            try data.write(to: url, options: .atomic)
        } catch {
            print("Error saving the object: \(error)")
        }
    })
}

I replicated the same pattern across the fetch, mutate, and delete functions as well. You can see that version of the code here.
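
For illustration, here’s my best guess at the shape of the coordinated fetch; the exact version is in the linked code:

func fetchObject() -> T? {
    var error: NSError?
    var result: T?
    let coordinator = NSFileCoordinator(filePresenter: nil)
    coordinator.coordinate(readingItemAt: storageURL, options: [], error: &error, byAccessor: { url in
        do {
            let data = try Data(contentsOf: url)
            result = try JSONDecoder().decode(T.self, from: data)
        } catch {
            if (error as NSError).code != NSFileReadNoSuchFileError {
                print("Error reading the object: \(error)")
            }
        }
    })
    return result
}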

To test, for example, the mutatingObject code, I hit the file with 10,000 accesses, each incrementing an integer in the stored file by 1. I’ll leave out the cruft of the test, but the bulk of it looked like this:

let limit = 10_000
DispatchQueue.concurrentPerform(iterations: limit, execute: { i in
    objectStorage.mutatingObject(defaultValue: IntHolder(value: 0), { holder in
        holder.value += 1
    })
})

XCTAssertEqual(objectStorage.fetchObject()?.value, limit)

This is where I discovered my problem. Without NSFileCoordinator, mutatingObject dropped about 70% of writes. With it, even using the reading and writing options correctly, I was still losing a few dozen writes out of every 10,000. This was a serious issue: an object store that loses about half a percent of writes isn’t reliable enough to use in production. (Apple folks, here’s a radar. It includes a test project with a failing test.)

At this point, I started thinking about what I was actually trying to do, and whether NSFileCoordinator was really the right API for the job.

The primary use cases for NSFileCoordinator are dealing with document-like files that can change out from underneath you via other processes, and syncing data to iCloud, where data might change out from underneath you as well. Neither of these is really an issue I have; I just need to coordinate writes within a single process to ensure that no data races occur. The fundamental abstraction of “coordinating” reads and writes is sound; I just don’t happen to need most of the complexity of NSFileCoordinator.

I started thinking about how NSFileCoordinator had to be structured under the hood to act the way it does. It must have some kind of locking mechanism around each URL. Reads can be concurrent and writes have to be serial. Other than that, I don’t need much.

Based on that, I built my own simple file coordinator. It keeps track of a global mapping of URLs to dispatch queues.

final class FileCoordinator {
            
    private static var queues: [URL: DispatchQueue] = [:]

To ensure that access to the global queues variable is thread-safe, I need a queue to guard access to the queues. (I heard you like queues.)

    private static let queueAccessQueue = DispatchQueue(label: "queueAccessQueue")

From there, we need to get a queue for any URL. If a queue doesn’t exist in the queue map yet, we can make one and stick it in.

    private func queue(for url: URL) -> DispatchQueue {
        return FileCoordinator.queueAccessQueue.sync { () -> DispatchQueue in
            if let queue = FileCoordinator.queues[url] { return queue }
            let queue = DispatchQueue(label: "queue-for-\(url)", attributes: .concurrent)
            FileCoordinator.queues[url] = queue
            return queue
        }
    }

Lastly, we can implement our public interface, and use the reader-writer pattern to ensure that data is accessed safely.

    func coordinateReading<T>(at url: URL, _ block: (URL) throws -> T) rethrows -> T {
        return try queue(for: url).sync {
            return try block(url)
        }
    }
    
    func coordinateWriting(at url: URL, _ block: (URL) throws -> Void) rethrows {
        try queue(for: url).sync(flags: .barrier) {
            try block(url)
        }
    }
}
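
Wiring this into ObjectStorage is then straightforward. Here’s a sketch of what save(object:) might look like on top of it, assuming the storage owns a fileCoordinator instance:

func save(object: T) {
    do {
        try fileCoordinator.coordinateWriting(at: storageURL) { url in
            let data = try JSONEncoder().encode(object)
            try data.write(to: url, options: .atomic)
        }
    } catch {
        print("Error saving the object: \(error)")
    }
}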

There are a few benefits of this over Foundation’s NSFileCoordinator. First, it doesn’t drop any writes. I can hammer on this implementation and it safely updates the file on each iteration. Second, I control it. I can change it if I run into any issues, and it has clear room to grow. Lastly, it takes advantage of some great Swift features, like rethrows and using generics to return an object from the inner block and have it be returned from the outer method call.

There are downsides to this approach, too. It naturally won’t work across processes like Apple’s version can, and it won’t have a lot of the optimizations for the different reading and writing options (even though we could add them).

There’s also lots of room to grow here. For example, if you’re touching a lot of URLs, the queues dictionary will grow without bound. There are a few approaches to putting a limit on that (though care is needed: evicting a queue that’s still in use would quietly break the mutual exclusion), but I don’t expect this to touch more than 10 or so URLs, so I think it’s fine without the added complexity for now.

Replacing a system class can be a tough call — Apple’s classes tend to be well-written and handle lots of edge cases gracefully. However, when you find that their implementations aren’t serving you, it can be worth it to examine how complicated replacing it will be. In this case, replacing it wasn’t too much code, my test harness could catch bugs, and not relying on already-written code reduced the complexity rather than increasing it.

Update: An engineer at Apple responded to my Radar. It was a configuration error on my part:

The problem here is that the code is using an (apparently useless) NSFilePresenter when initializing the NSFileCoordinator. File Coordination will actually NOT block against other read/write claims made with the same NSFileCoordinator instance or NSFileCoordinator instances made with the same filePresenter or purposeIdentifier.

You most likely just want to use the default initializer for NSFileCoordinator here, or else manually queue up writes internally.

They continued:

Note: if you don’t care about other processes or in-process code outside of your control (i.e. in a framework), you absolutely should NOT use file coordination as a means of mutual exclusion. The cross-process functionality is definitely the main motivation to use File Coordination, but it comes with a relatively high performance (and cognitive) cost in comparison to in-process means.

Given that my in-process FileCoordinator works well, and the Apple engineer so strenuously recommended that I not use NSFileCoordinator, I’ll probably just keep using the class I made.