September 1, 2015

Many Models

When you first start making an app, your choices about the model layer will impact you the most in the long run. It'll pay dividends to think about these choices up front, instead of being left with the detritus of accidental decisions.

One of these decisions is how to cache your data: whether you'll use Core Data or something simpler.

Another big decision is whether each unique model is represented by one instance or many. This is, in some ways, the crux of the difference between object-oriented and functional/immutable styles. More simply put, it's the class way versus the struct way. Let's examine the differences.

With the object-oriented style, you have living, breathing models. You can send messages to the model, and it can respond, make decisions, perform network requests, and generally act as a first-class citizen in your app.

Making this work requires a design pattern called the Identity Map, which is just a big dictionary that maps each identifier to its object instance. When fetching a model from a store (whether it's a network store or a store persisted on the device), each object ID is checked against the Identity Map, and the existing instance is reused if there is one. Objective-C's flexible initialization makes this really easy.

- (instancetype)initWithObjectID:(id<NSCopying, NSObject>)objectID {
    id existingObject = [[SKModel identityMap] objectForKey:objectID];
    if (existingObject) {
        self = existingObject;
        return existingObject;
    }

    self = [super init];
    if (!self) return nil;

    self.objectID = objectID;
    [[SKModel identityMap] setObject:self forKey:objectID];

    return self;
}
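
The same idea can be sketched in Swift. This is a minimal illustration rather than a production implementation: Model, IdentityMap, and the string IDs are hypothetical names I'm making up for the example, and the map holds strong references, so nothing is ever evicted.

```swift
// Hypothetical names throughout: Model and IdentityMap are illustrative,
// not part of any framework.
final class Model {
    let objectID: String
    var title: String

    init(objectID: String, title: String) {
        self.objectID = objectID
        self.title = title
    }
}

final class IdentityMap {
    // Maps each identifier to the single live instance for that identifier.
    private var instances: [String: Model] = [:]

    // Returns the existing instance for an ID, or creates and registers one.
    func model(withID objectID: String, title: String) -> Model {
        if let existing = instances[objectID] {
            return existing
        }
        let fresh = Model(objectID: objectID, title: title)
        instances[objectID] = fresh
        return fresh
    }
}

let identityMap = IdentityMap()
let firstFetch = identityMap.model(withID: "42", title: "Buy milk")
let secondFetch = identityMap.model(withID: "42", title: "ignored, already cached")

// Both variables point at the same object, so a change made through one
// is visible through the other.
firstFetch.title = "Buy oat milk"
print(firstFetch === secondFetch)  // true
print(secondFetch.title)           // Buy oat milk
```

A real version would probably hold its instances weakly, so that models nobody is using can be deallocated.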

Core Data does this for you. If you fetch the same object twice (on the same object context), you will get the same instance back. From the Core Data Documentation:

"Core Data ensures that—in a given managed object context—an entry in a persistent store is associated with only one managed object. The technique is known as uniquing. Without uniquing, you might end up with a context maintaining more than one object to represent a given record."

Given that your stores will return the one instance for each object ID, that model can change out from under any controller-type objects that are holding on to it. Therefore, each controller needs to observe changes on its models and re-render its view to reflect those changes.

There are two cases in which this approach works really well. The first is when there are many views on screen, some of which point to the same models. Making a change in one view should also be reflected in the other views representing the same object. The second is data persisted on disk. Data on disk changes frequently and nonatomically. For a todo app, the user might change the due date, which would save the model to disk, then the priority, which would save the model again. Using the same model object makes our program simpler.

The second approach is to use many instances for each individual model. For this approach, each time you fetch from your store (again, either network or persisted), you create a fresh struct (or struct-like object) and use that. When modifying, ensure isolation of each object either by fetching it anew in each place that you'll need it, or by using copy-on-write to create a new instance for each modification.
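
Here's a rough sketch of what that isolation looks like with a Swift struct. Todo and its fields are hypothetical, invented for the example; the point is only that a modification produces a new value and leaves the fetched one alone.

```swift
// Hypothetical model; the field names are illustrative.
struct Todo {
    let id: Int
    var title: String
    var priority: Int

    // Returns a modified copy and leaves the receiver untouched.
    func withPriority(_ newPriority: Int) -> Todo {
        var copy = self
        copy.priority = newPriority
        return copy
    }
}

let fetchedTodo = Todo(id: 1, title: "Write post", priority: 2)
let editedTodo = fetchedTodo.withPriority(5)

print(fetchedTodo.priority)  // 2
print(editedTodo.priority)   // 5
```

Anyone still holding fetchedTodo sees the old priority until they fetch again, which is exactly the trade being made here.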

This approach shines on single-screen platforms, like iOS, where the user is generally looking at one thing at a time. In cases like this, you can lazily refresh data when it comes back on screen, rather than refreshing it greedily. It also shines in systems where the "source of truth" is on a server. Any mutating REST call is an atomic change that will return a response that is fully-formed and fully-validated by the server application. It's also great for immutable data, like tweets. When things can't be edited by the user, it's much safer to use a system that prefers less mutation, like structs.

While using actual Swift structs grants some guarantees about how the thing will be used, it comes with some cost as well. Drew Crawford writes about the "Structs Philosophy™".

The insight here is that doing anything of value involves calling at least one mutating function and as soon as you do that, your function must be mutating, and everything that calls you must be mutating, and it's mutating all the way up, except for the silly leaf nodes like CGRect or your bag of Ints.

Like Drew, I'm not sure I can advise making your whole model layer out of purely Swift structs. For one, any kind of mutation is costly, especially for deeply nested hierarchies of data. Second, structs can't really represent members like images, data, colors, and URLs as value types yet, even though they are often components of models and clearly are values. Using those types requires bridging to Objective-C, which loses a lot of guarantees of immutability and isolation. Lastly, it requires making your models somewhat "dumb". While you can attach functions to structs in Swift, they seem to be more for manipulating the data rather than doing any work, like making an API request that regards the model.

The choice between many instances and one is in your hands. Don't make the decision lightly, however. The path you choose will affect the bedrock of your app, and it will be hard to change later.

August 26, 2015

The Scrivener

There's a special class of scribe that takes requirements from a client and translates them into highly precise, sometimes arcane documents. Few laypeople can understand these documents and the policies they encode, but they nevertheless have a great effect on everyday life.

The domain of the clients and requirements is often different from the domain of the scribe; the scribe has to learn to efficiently translate the rules of the business she represents into the precise language of the documents.

These documents are parsed and analyzed by another body. The meaning drawn from the documents won't necessarily be the same as what the scribe or her clients intended, but the effects of the documents stand, nevertheless.

These scribes are paid a lot of money for their skills, and a lot of people find it frustrating that no one else can do what they do. But ultimately, it's not really possible to have a modern, functioning society without them.

I'm talking about lawyers writing contracts, of course.

The parallels between lawyers and programmers are myriad. These parallels are sometimes trivial, like defining variables; every programmer has seen an employment contract that has text like "Widgets LLC, hereinafter referred to as 'the Company'". Defining a variable in this way shortens the contract, prevents redefinition errors, and makes the contract easier to change.

Lawyers also have to flesh out simple ideas (for example, the statement "things that I make") into really complex and precise statements ("all Inventions that I may solely or jointly author, discover, develop, conceive, or reduce to practice during the period of the Relationship").

Contracts are more similar to interpreted code than to compiled code. The contract can be "statically analyzed" (i.e., read by other lawyers) while it's being written, but its meaning won't be fully determined until it's challenged and taken before a judge.

Judges are much more lenient than code interpreters, so typos and other simple errors aren't commonly held against a lawyer the way they might be with a programmer. Ambiguity, however, provides ammo that opposing counsel can use to change the intent of the contract. That's a runtime error if I've ever heard of one! If only we could give lawyers automated tests.

Much fuss is made about the future of programming, whether it will be text, or some kind of semantic editor, or perhaps something graphical. On the other hand, the future of contracts appears to be a lot more form contracts and contract generation, rather than a dramatic rethinking of their representation. Generated, parameterized contracts parallel either functions or libraries in programming, although it strains the metaphor a little bit.

It's also easy to imagine the profession of programming going in the same direction as lawyering. Extreme demand and limited supply have caused engineer salaries to skyrocket to around (and above, in some cases) a lawyer's starting salary. The long hours and benefits (free food if you stay past a certain time!) are also slowly starting to resemble those of big law firms. Programming culture could definitely use the specialized trade schools and the strict pipeline that law benefits from. Our lawyerly brothers and sisters have already been through what we're going through, and we've got a lot to learn from them.

August 18, 2015

The Back of the Fence

When I write code, my goal is to take as few shortcuts as possible. People often ask me why I bother.

I bother because shortcuts join together like Voltron to tangle your code. I bother because it’s hard enough to read my code when it’s written well. I bother because I never know who’s going to be looking at my code.

Code quality is precisely the proverbial “back of the fence”. Perhaps apocryphally, Steve Jobs was known to care about small, sometimes invisible details. He would fuss over the beauty in a circuit’s design, because his father inspired him to consider the minutiae:

It was important, his father said, to craft the backs of cabinets and fences properly, even though they were hidden. “He loved doing things right. He even cared about the look of the parts you couldn’t see.”

It applies to more than just code and circuit boards. I’ve noticed the best designers that I’ve worked with have meticulously assembled PSD files. Deep hierarchies of organization, consistently-named layers, edges perfectly between pixels rather than on top of them. This attention to detail is reflected in the quality of the designs as well; truly, someone who sweats the details sweats the big picture, too.

The back of the fence never aligns with business metrics; code quality is no exception. It’s a second-order effect which can only affect your bottom line in indirect ways. It’s cheaper to write the code right the first time, rather than having to fix its bugs later. It’s cheaper to work with supple code, code that’s been designed with change in mind.

(There’s one little hack. If you’re working on an open-source project that’ll be used by developers, then your code quality is no longer merely adjacent to cost. You can effortlessly align your business metrics and code.)

In some cases, it’s not possible to draw even the most tenuous connection between the concerns of your business and back-of-the-fence style code quality. For those times, we might call it professional pride.

Joe Cieplinski is a designer who (not by coincidence, I’m sure!) creates extremely neat PSDs. I’ll leave you with some remarks from Joe’s talk at CocoaLove last year:

We don’t design beautiful things hoping that people notice. We design beautiful things knowing that they probably won’t. […] We do design for us. We do design because we want to sleep at night.

August 12, 2015

Bend the Language

Swift brings lots of awesome new features. I'm looking forward to using lots of them, even though it's still a bit early for me to adopt the language. Even if you don't want to or can't adopt the new language yet, you should still be able to get your hands on the new features that they've created, like value types, richer enums, and protocol extensions. In some cases, you might even want to be able to experiment and get access to these features even before Apple announces them.

Luckily for us, a language that’s rich enough lets us approximate these features. In the immortal words of Kanye West, "everything in the world is exactly the same." Let's take a look at a few ways we can get Swifty features in Objective-C.

Value Types

The first awesome Swift feature we want to examine is the Swift struct and its value semantics. To know how to replicate this feature in something like Objective-C, we need to first figure out how we use it and which parts of the feature we want.

Unfortunately, developing this distinction between inert values and living, breathing instances isn't easy. Via the Swift book:

As a general guideline, consider creating a structure when one or more of these conditions apply:

  • The structure’s primary purpose is to encapsulate a few relatively simple data values.
  • It is reasonable to expect that the encapsulated values will be copied rather than referenced when you assign or pass around an instance of that structure.
  • Any properties stored by the structure are themselves value types, which would also be expected to be copied rather than referenced.
  • The structure does not need to inherit properties or behavior from another existing type.

My general criterion is if you don't care which copy of a thing you have, it's a value.

So, given that description of values, we primarily want to take advantage of their immutability and isolation. (There are some slight performance gains as well, but come on — profile before optimizing.)

For a mutable thing to cause unintentional bugs in your app, it must also be accessed from more than one place; after all, if it’s only ever used in one place, it can’t change out from under you. Structs in Swift solve this problem by either being immutable, in which case you can’t change them, or being copy-on-mutate: as soon as you change one, you get a fresh copy that’s yours and yours alone.

To get the same benefit, we need to preserve either the immutability of our object or its isolation. To preserve isolation in Objective-C, we can conform to <NSCopying> and declare all properties as copy. This is pretty tedious and it’s easy to forget to copy every time you use the object in a new place.

Immutability, on the other hand, is all defined in the class’s interface, and lets us give the class's users hints about how the class should be used.

In The Value of Value Objects, I write about using a pattern called Tiny Types (gist) to get immutable wrappers around values.

@interface Hostname : ValueObject

- (instancetype)initWithString:(NSString *)name;

@property (readonly) NSString *name;
@property (readonly) NSURL *hostnameAsURL;
@property (readonly) Hostname *hostnameByEnsuringSSL;

@end

@implementation Hostname

- (instancetype)initWithString:(NSString *)name {
    return [self initWithBackingObject:name];
}

- (NSString *)name {
    return self.backingObject;
}

- (NSURL *)hostnameAsURL {
    return [NSURL URLWithString:self.name];
}

- (Hostname *)hostnameByEnsuringSSL {
    NSURLComponents *URLComponents = [NSURLComponents componentsWithURL:self.hostnameAsURL resolvingAgainstBaseURL:YES];
    URLComponents.scheme = @"https";
    return [[Hostname alloc] initWithString:URLComponents.URL.absoluteString];
}

@end

Because this object isn't mutable, you never have to worry about it changing out from underneath you. Even when it does need to be changed, as in -hostnameByEnsuringSSL, it returns a new Hostname object, never changing the existing one.

In the same way that Swift's compiler enforces proper behavior, so too does Objective-C's. With a message like -hostnameByEnsuringSSL, the name and its type signature make it clear that something different is happening, and that you need to handle it in a special way.

This pattern can be extended to allow initialization with multiple "backing objects" rather than just one, but the principles (immutable properties and copy-on-mutate) stay the same.

Rich Enumerations

Swift's enums are a great step forward from C's enumerations. The greatest advantage is that you can associate both data and functions with the enum now. In the old C-style, enumerations were nothing more than some sugar around a number:

typedef enum Shape : NSUInteger {
    Circle,
    Rectangle
} Shape;

To do work with these, you have to pass them into a top-level or free function:

NSInteger calculateArea(Shape s);

Reversing the subject and object like this is harder to read, for some reason, and no one likes it.

Swift allows us to create an enum more easily and associate data with it (such as the radius or width and height below). Once we have data, we can add behavior as well:

enum Shape {
    case Circle(Double)
    case Rect(Double, Double)

    func area() -> Double {
        switch self {
        case let .Circle(r):
            return Double.pi * r * r
        case let .Rect(width, height):
            return width * height
        }
    }
}

Ultimately, though, these enumerations are all "sum types", or tagged unions. They can be in one of many disjoint states. Fortunately for us, sum types can take many forms, and we have access to some of them in Objective-C. If we want a sum type with associated data and functions, just like Swift’s, we can easily get that.

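@interface Shape : NSObject

@property (readonly) NSInteger area;

@end

@interface Circle : Shape

- (instancetype)initWithRadius:(NSInteger)radius;

@property (readonly) NSInteger radius;

@end

@implementation Circle

- (instancetype)initWithRadius:(NSInteger)radius {
    self = [super init];
    if (!self) return nil;
    _radius = radius;
    return self;
}

// Each concrete subclass brings its own implementation of -area.
- (NSInteger)area {
    return M_PI * self.radius * self.radius;
}

@end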

Here, instead of an enum called Shape, we have an abstract class called Shape, and several concrete classes that derive from it, defining their own initializers and bringing their own implementations for -area.

There’s a caveat here, which is that this solution uses inheritance. However, this is something I'd call "inheritance with intention". It's small, contained, and out of the way. The polymorphism is defined early in the process, allowing it to be designed holistically. It lets other objects operate with less knowledge of how Shape works and minimizes code paths. If it still bothers you, you can get the same "sum type" effect without inheritance by using protocols.

This pattern is discussed in more depth in Replace Enumerations With Polymorphism.

Protocol Extensions

Another big awesome feature in Swift is protocol extensions. I love this one and I feel like I've wanted it in Objective-C forever. At their core, protocol extensions add extra behavior to an abstract set of messages. I've been leaning very heavily on decoration to get the same thing (behavior added to a preexisting thing) when I'm in Objective-C.

Let's look at an example. A data source object helps you map index paths to objects, so that they can be displayed in a table view or collection view. A protocol for a data source might be defined like so:

@protocol DataSourceProtocol

@property NSInteger numberOfSections;
- (NSInteger)numberOfObjectsInSection:(NSInteger)section;
- (id)objectAtIndexPath:(NSIndexPath *)indexPath;

@end

This is the bare minimum we need to implement this protocol. If we were in Swift and we wanted to add any functions whose results were derived from these, we could use protocol extensions. In Objective-C, however, we'll use decoration:

@implementation FullDataSource

- (instancetype)initWithDataSource:(id<DataSourceProtocol>)dataSource { /* ... */ }

//forward -numberOfSections, -numberOfObjectsInSection: and -objectAtIndexPath: to self.dataSource

- (NSArray *)allObjects {
    NSMutableArray *allObjects = [NSMutableArray array];
    NSInteger numberOfSections = self.numberOfSections;
    for (NSInteger sectionIndex = 0; sectionIndex < numberOfSections; sectionIndex++) {
        NSInteger numberOfObjectsInSection = [self numberOfObjectsInSection:sectionIndex];
        for (NSInteger objectIndex = 0; objectIndex < numberOfObjectsInSection; objectIndex++) {
            [allObjects addObject:[self objectAtIndexPath:[NSIndexPath indexPathForRow:objectIndex inSection:sectionIndex]]];
        }
    }
    return allObjects;
}

- (NSIndexPath *)indexPathForObject:(id)object {
    for (NSInteger sectionIndex = 0; sectionIndex < self.numberOfSections; sectionIndex++) {
        NSInteger numberOfObjectsInSection = [self numberOfObjectsInSection:sectionIndex];
        for (NSInteger objectIndex = 0; objectIndex < numberOfObjectsInSection; objectIndex++) {
            NSIndexPath *indexPath = [NSIndexPath indexPathForRow:objectIndex inSection:sectionIndex];
            // Compare the stored object against the one being searched for.
            if ([[self objectAtIndexPath:indexPath] isEqual:object]) {
                return indexPath;
            }
        }
    }
    return nil;
}

@end

By wrapping <DataSourceProtocol> in FullDataSource, we can add behavior to any data source defined with that protocol, like -allObjects and a reverse lookup method called -indexPathForObject:. We could also add a method to enumerate over the objects with a block, and so on.

Whereas with the enumeration example we had caveats, here we have two major advantages. First, decoration can be done multiple times. If you want to wrap a wrapped thing, you can do that. Second, you gain the ability to change which decorators you are using dynamically. Some <DataSourceProtocol> objects need some decorators, and others need different ones. You can mix and match them however you like.
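
For comparison, here's roughly what the protocol-extension version of the same idea might look like in Swift. It's a sketch under a few assumptions of mine: the DataSource protocol, its method names, and NestedArrayDataSource are all hypothetical, and I've swapped NSIndexPath for a plain (section, row) pair so that the example stands on its own.

```swift
// A Swift counterpart to DataSourceProtocol. The index path is replaced by a
// plain (section, row) pair so the sketch has no framework dependency.
protocol DataSource {
    associatedtype Object
    var numberOfSections: Int { get }
    func numberOfObjects(inSection section: Int) -> Int
    func object(atSection section: Int, row: Int) -> Object
}

extension DataSource {
    // Derived entirely from the three requirements above.
    var allObjects: [Object] {
        var result: [Object] = []
        for section in 0..<numberOfSections {
            for row in 0..<numberOfObjects(inSection: section) {
                result.append(object(atSection: section, row: row))
            }
        }
        return result
    }
}

extension DataSource where Object: Equatable {
    // Reverse lookup: the first (section, row) whose object equals the target.
    func position(of target: Object) -> (section: Int, row: Int)? {
        for section in 0..<numberOfSections {
            for row in 0..<numberOfObjects(inSection: section) {
                if object(atSection: section, row: row) == target {
                    return (section, row)
                }
            }
        }
        return nil
    }
}

// Hypothetical conforming type backed by nested arrays.
struct NestedArrayDataSource: DataSource {
    let sections: [[String]]
    var numberOfSections: Int { return sections.count }
    func numberOfObjects(inSection section: Int) -> Int { return sections[section].count }
    func object(atSection section: Int, row: Int) -> String { return sections[section][row] }
}

let letterSource = NestedArrayDataSource(sections: [["a", "b"], ["c"]])
print(letterSource.allObjects)  // ["a", "b", "c"]
if let found = letterSource.position(of: "c") {
    print(found.section, found.row)  // 1 0
}
```

Every conforming type gets allObjects and position(of:) for free, but unlike decorators, you can't choose a different set of extras per instance at runtime.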

Conclusions

I've written about all these patterns here with more depth: value objects here, enumerations here, and decoration here, here, here, and here. Yes, Swift allows you to express these ideas much more tersely, and in some cases, prevents you from making errors while expressing them. This is to Swift's benefit. These ideas are more flexible than their specific implementations in Swift, however. As you implement them, you get to decide exactly how they work.

It's also notable that the blog posts describing these patterns came out before their respective Swift counterparts. If you’re willing to bend the traditional uses of the language a little, you can reap their benefits, and you can do it before everyone else. Don't wait for anyone. Write your own future.

August 6, 2015

There are no mysteries

I've been programming in earnest for about 5 years. When I was just beginning, a lot of the way computers and programming worked seemed like sheer magic. As I've gotten better at programming, a lot of that opacity has started to fade away. The illusions of abstraction are disappearing. It's a little terrifying that the people who make the foundation of our technology aren't that much smarter than me or you, but mostly this epiphany overjoys me. Nothing is inscrutable!

When I first started making iOS apps, I remember wiring up my IBAction to a button in Interface Builder and thinking "Okay, but how is this method getting called?" If I'm remembering right, part of me thought that it might be some kind of polling loop. I also remember thinking that it couldn't possibly be that stupid. Turns out, it is. NSRunLoop is just a big while loop that spins and makes your whole app go around.

Arrays are easy enough; they're just an unbroken block of memory. With an index for an item, you can figure out what its memory location is: (offset + index*item_size). But how the heck do dictionaries (or hashes, or maps) work? How can they get instant access with a key that isn't a number?

You've got to convert that key into a number, so you can put it into an array. (Turns out, this is one of the things you learn when you get a real computer science degree in a class called "Data Structures". I had a degree-having friend explain it to me.) Take a big array, hash the key into a number (hence, the name "hash"), mod it by the size of the array so it'll fit into one of the array's slots, and insert it into a bucket at that index. When fetching, the key is hashed again, and it should fall into the same spot in the array, and you can get the item from the bucket. It's not magic, even though it certainly feels like it. Here's a good Mike Ash post with code explaining how NSMutableDictionary is written.
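
To make that concrete, here's a deliberately tiny sketch of the hash-then-mod-then-bucket idea in Swift. TinyHashMap is a hypothetical teaching toy of my own: it only handles String keys and Int values, never resizes, and is not how NSMutableDictionary or Swift's Dictionary is actually implemented.

```swift
// A deliberately tiny hash map with separate chaining. A teaching sketch,
// not how NSMutableDictionary or Swift's Dictionary is implemented.
struct TinyHashMap {
    // Each slot of the backing array is a bucket of key/value pairs.
    private var buckets: [[(key: String, value: Int)]]

    init(slotCount: Int = 8) {
        buckets = Array(repeating: [], count: slotCount)
    }

    // Hash the key to a number, then mod by the array size to pick a slot.
    private func slot(for key: String) -> Int {
        var hash = 5381
        for byte in key.utf8 {
            hash = (hash &* 33) &+ Int(byte)  // wrapping arithmetic; overflow is fine
        }
        // `%` can return a negative result for a negative hash, so normalize it.
        let count = buckets.count
        return ((hash % count) + count) % count
    }

    mutating func set(_ value: Int, for key: String) {
        let index = slot(for: key)
        if let existing = buckets[index].firstIndex(where: { $0.key == key }) {
            buckets[index][existing].value = value  // overwrite an existing key
        } else {
            buckets[index].append((key: key, value: value))
        }
    }

    // Hash again, land in the same slot, and search that one bucket.
    func value(for key: String) -> Int? {
        return buckets[slot(for: key)].first(where: { $0.key == key })?.value
    }
}

var ages = TinyHashMap()
ages.set(36, for: "ada")
ages.set(41, for: "alan")
ages.set(37, for: "ada")

print(ages.value(for: "ada") == 37)     // true
print(ages.value(for: "grace") == nil)  // true
```

Two keys can land in the same slot, which is why each slot holds a bucket rather than a single item. Lookups stay fast as long as the buckets stay short.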

The big kahuna: compilers. How do they work? You could go read the daunting Structure and Interpretation of Computer Programs (which I promise I will do one of these days), or read this relatively short (16,000-word) blog post that explains how to write a lexer, parser, and evaluator for a small subset of Lisp. It took me about two hours to get through, but in the end I had a pretty good sense for how Lisp (probably the simplest programming language) was parsed and compiled. How do you make this Lisp compile down to machine code? I don't know. Yet.

Richard Feynman, when asked if understanding our world via science removes its magic, responds with a parable about an artist:

He'll hold up a flower and say "look how beautiful it is," and I'll agree. Then he says "I as an artist can see how beautiful this is but you as a scientist take this all apart and it becomes a dull thing," and I think that he's kind of nutty. […] All kinds of interesting questions which the science only adds to the excitement, the mystery and the awe of a flower. It only adds. I don't understand how it subtracts.

Garbage collectors; NSCache; image convolution; UIScrollView; bloom filters: with all of it, there's no voodoo. Even if it feels like a black box that you'll never be able to peer into, it's all just code. No matter how many stupid arcane meaningless equations are on the Wikipedia page, it's simpler than that.

It's all just bits and bytes that are arranged in nice ways that have nice ramifications. You can understand those bits and bytes, and knowing that the computing world has order is essential to understanding that you can affect it and improve it.

July 29, 2015

The Braid of Thought

Warning: This one’s not about programming, but it is programming adjacent.

Meditations on Moloch, which I’ve linked to before, is a great article. Alexander finds a few texts and weaves them together, creating an argument that binds them all. He wraps each text into that braid until a free-standing argument is born from each of the separate texts.

Chronology is a harsh master. You read three totally unrelated things at the same time and they start seeming like obviously connected blind-man-and-elephant style groping at different aspects of the same fiendishly-hard-to-express point.

A braid of thought — multiple ideas that come together from varied sources that you happened upon almost by chance — follows.

As We May Think is a classic article from 1945, where Vannevar Bush describes a piece of technology he calls the “memex”.

A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.

Upon reading this, many humans rejoice, seeing a reflection of the memex in the modern web or in Wikipedia. Bush seemed to have predicted the widespread dissemination of knowledge that's taken for granted today. But Bret Victor, when reading Bush’s piece, sees failure in the modern web. He writes about this failure in the Web of Alexandria and its follow up.

The web, of course, took a different approach. A million volumes, yes, but our desks remain empty. Instead, when we summon a volume, we are granted a transient and ephemeral peek at its sole instance, out there somewhere in the world, typically secured within a large institution.

This server/client, truth-lives-in-the-cloud, single-point-of-failure model is so engrained in me that when considering a new product, I don’t even evaluate other protocols for data storage and transmission. But there’s so many other templates we can base our information model off of.

Consider email, where every participant keeps a copy of every single dispatch. Consider Git, where each programmer keeps a copy of every single commit. Consider Bittorrent, where each user hosts only the files they care about.

There are also models we haven’t tried. Some datasets are really small, like my contacts. There’s no reason I couldn’t trust all those contacts to 5 or 6 of my close friends. I doubt they’d mind a few hundred extra kilobytes on their drives. Even if I didn’t trust them, I can just heavily encrypt the contacts before I send them over.

It’s made me realize that there’s so many ways we limit ourselves with technology. Modern startups care about very specific things, like streams and attention and keeping your data. Silicon Valley’s conception of what an app can be is very narrow-minded, bounded by the dreams of hockey-stick user growth and a high valuation. Paul Ford explains why these types of companies want that type of data in his post about Ashley Madison.

I’ve never built a translucent database-driven system because none of my clients have ever been the least bit interested. They want names, addresses, credit cards, and the like. But they don’t actually need a lot of that data to build a good web service. They need it for potential marketing purposes.

These connections are yet reinforced by Maciej Cegłowski’s Web Design: The First 100 Years. The connections here are left as an exercise for the reader.

If I’m understanding Victor’s argument correctly, it’s the very structure of the web (combined with a thirst for profit, I would probably add) that makes these problems arise. We could restructure things and make the web suck less by default. A pit of success of usability and humanity.

To cap off this little mini-web of interconnectivity, last week Mike Caulfield wrote about taking this idea further in Beyond Conversation. He described how links fit into this world: “Links are made by readers as well as writers,” and that was the moment for me that all these threads wound themselves into a much stronger braid.

We have values: links shouldn’t rot; users should have control of their data; media companies should serve users, and not the other way around. These values are incompatible with the Internet in its current conception, and we can’t build the future we want on top of a foundation that won’t support it.

Caulfield’s general solution is for each user to create her own wiki. A personal wiki has never had much appeal for me, since I don’t have a category of stuff I write that I wouldn't publish here. I don’t really write much privately. However, you could take all the pages I love, all the pages I think are important, all the pages I think are mildly interesting, all the pages I’ve seen, all the conversations I’ve had, and all the pictures I’ve taken and save them on my computer. Make it searchable. Now you’re talking about something I really understand.

Allow me to make links and create associations on top of these documents and this very blog post becomes a lot easier to research and write. Chat logs that have links in them should point to the pages that they link to; those pages should link back to the chat logs. What if there were two documents I loved that both linked to a third document on the web? I’d probably want to read that. What if documents, web pages, and images could cluster around a physical location? I’d probably want to know when I’m near that spot.

It’s hard to solve the link rot problem in the general case, without downloading the whole Internet. Fortunately, I don’t care about most of the Internet; I only care about the stuff I’ve read. We have so much cheap storage available now that it’s almost criminal not to save all the web pages you look at. But we don’t keep them so someone can sell ads to you more easily; we keep them so you can easily find the stuff you liked and cared about and thought about.

The idea of the “outboard brain” is a technique for using computers for what computers are good at, and freeing up your brain for what brains are good at. This tool is an outboard brain for idea generation. ("An enlarged intimate supplement to his memory"!) I think I remember all of the various articles and writings that led to this blog post, but what if I haven’t? Maybe I’ve forgotten a thread that would pull this braid in a totally different direction. (Oh, yeah. I just remembered: watching the BBC’s Connections definitely put me in the mood to start thinking about how all these things are related. – ed.) I’m really bad at remembering stuff, and I’d love to relegate that responsibility to a tool that’s great at it.

Computers are for people. Let's make them so.

July 22, 2015

Cache Me If You Can

Core Data is a powerful framework. It seems a lot like an ORM, but its advocates are quick to remind you that it's "actually" an object persistence framework. I think that's how they stomach not being able to run arbitrary SQL on their own database.

Needling aside, it's the right choice for a lot of apps. When the user has a set of data that's wholly on the device, reach for Core Data. In a lot of the cases I've worked on recently, however, I've found that Core Data functions as more of a cache for objects that canonically live on a server and present themselves through an API.

For cases like these, Core Data is tremendous overkill. Core Data was designed before the prevalence of web services and APIs. It was intended to represent an object graph in its entirety, rather than a small portion that has been downloaded from a service. Since you don't have the whole dataset, you can't even effectively query against it.

The costs to using Core Data are very high, since it's so complex, and the benefits are pretty minimal. Primarily, these types of apps use it to persist the objects so that they'll work in the subway or load quicker on the next launch. Fortunately, we can write a small amount of code to get this effect without having to conform to Core Data's madness.

Foundation provides a protocol called <NSCoding>, which is as simple and elegant as Core Data isn't. By making your models conform to <NSCoding>, you can easily use NSKeyedArchiver and NSKeyedUnarchiver to save your objects to disk. Many built-in objects, like collection types, already conform to <NSCoding>, so you get those for free.
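For a sense of how little code conformance takes, here's a minimal sketch of a model that adopts <NSCoding>. SKFollower and its two properties are hypothetical names I made up for illustration, and the sketch assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical model; the class and property names are illustrative only.
@interface SKFollower : NSObject <NSCoding>

@property (nonatomic, copy, readonly) NSString *userID;
@property (nonatomic, copy, readonly) NSString *name;

- (instancetype)initWithUserID:(NSString *)userID name:(NSString *)name;

@end

@implementation SKFollower

- (instancetype)initWithUserID:(NSString *)userID name:(NSString *)name {
    self = [super init];
    if (!self) return nil;

    _userID = [userID copy];
    _name = [name copy];

    return self;
}

// Called by NSKeyedUnarchiver to rebuild the object from an archive.
- (instancetype)initWithCoder:(NSCoder *)decoder {
    NSString *userID = [decoder decodeObjectForKey:@"userID"];
    NSString *name = [decoder decodeObjectForKey:@"name"];
    return [self initWithUserID:userID name:name];
}

// Called by NSKeyedArchiver to write each property out under a key.
- (void)encodeWithCoder:(NSCoder *)coder {
    [coder encodeObject:self.userID forKey:@"userID"];
    [coder encodeObject:self.name forKey:@"name"];
}

@end
```

An NSArray of these objects can be archived as-is, because the array conforms to <NSCoding> and asks each element to encode itself.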

To actually perform the caching, let's make a simple object called SKCache. Caches can be finicky and cause bugs easily, so I'd like to make it very easy to enable and disable.

@implementation SKCache

static BOOL _enabled = YES;

+ (void)enable {
    _enabled = YES;
}

+ (void)disable {
    _enabled = NO;
}

Caches also need a name:

- (instancetype)initWithName:(NSString *)name {
    self = [super init];
    if (!_enabled || !name) self = nil;
    if (!self) return nil;

    _name = name;

    return self;
}

From that name, they'll figure out where in the file system to save themselves.

- (NSString *)hashedName {
    return [self.name MD5String];
}

- (NSString *)cacheFilename {
    return [self.hashedName stringByAppendingPathExtension:@"cache"];
}

- (NSString *)appCacheDirectory {
    NSArray *searchPath = NSSearchPathForDirectoriesInDomains(NSCachesDirectory, NSUserDomainMask, YES);
    return searchPath.firstObject;
}

- (NSString *)cacheLocation {
    return [[self.appCacheDirectory
     stringByAppendingPathComponent:@"caches"]
     stringByAppendingPathComponent:self.cacheFilename];
}

(Note all the short, simple methods. This object follows the pattern from Graduation.)

Once we have a place to save objects to and fetch objects from, we can easily do that:

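- (void)saveObject:(id<NSCoding>)object {
    // The "caches" subdirectory doesn't exist on first launch, so create it before archiving.
    NSString *directory = [self.cacheLocation stringByDeletingLastPathComponent];
    [[NSFileManager defaultManager] createDirectoryAtPath:directory withIntermediateDirectories:YES attributes:nil error:nil];
    [NSKeyedArchiver archiveRootObject:object toFile:self.cacheLocation];
}

- (id<NSCoding>)fetchObject {
    return [NSKeyedUnarchiver unarchiveObjectWithFile:self.cacheLocation];
}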

@end

This particular type of cache is designed to totally overwrite all of its contents. Blowing it away entirely every time the app gets fresh data ensures there are fewer synchronization bugs. If you include a version number in the cache's name, you can easily increment it when the schema changes or if you accidentally add bad data to it. Since the true data lives on the server, the cache doesn't need to be durable at all. Changing the name of the cache will just leave an extra file in the Caches folder that iOS will clean up when it needs the space.
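One way to handle the versioning is to build the cache name from a constant. This is only a sketch: the SKFollowersCacheVersion constant, the SKFollowersCacheName function, and the ".v2" naming scheme are all my own assumptions, not part of SKCache.

```objc
#import <Foundation/Foundation.h>

// Hypothetical constant: bump it whenever the cached schema changes or bad data gets saved.
static NSInteger const SKFollowersCacheVersion = 2;

// Hypothetical helper: for user 1234 this builds "com.khanlou.followers.v2?forUser=1234".
static NSString *SKFollowersCacheName(NSString *userID) {
    return [NSString stringWithFormat:@"com.khanlou.followers.v%ld?forUser=%@",
            (long)SKFollowersCacheVersion, userID];
}
```

Because the name is hashed into the filename, bumping the constant points the cache at a brand new file and the old one is simply never read again.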

Once we have a quick and easy way of storing a model object (or array of model objects), we can get to using it. We can set up our cache inside a remote data source.

- (instancetype)init {
    self = [super init];
    if (!self) return nil;

    _fetcher = //set up a fetcher
    _cache = [[SKCache alloc] initWithName:@"com.khanlou.followers?forUser=1234"];
    [self loadFromCache];
    [self fetchData];

    return self;
}

When we first load the data source up, we check the cache for any old content that we can show while we wait for fresh data from the server.

- (void)loadFromCache {
    self.content = [self.cache fetchObject];
    [self informDelegateOfUpdate];
}

Next, we fetch fresh data. When we get it, we can save it in the cache and tell the UI to update via a delegate message.

- (void)fetchData {
    [self.fetcher fetchWithSuccessBlock:^(NSArray *results) {
        self.content = results;
        [self.cache saveObject:self.content];
        [self informDelegateOfUpdate];
    } failureBlock:^(NSError *error) {
        //don't blow away self.content
    }];
}

Lastly, we need accessors to get at the data for our table view:

- (NSInteger)numberOfSections {
    return 1;
}

- (NSInteger)numberOfObjectsInSection:(NSInteger)sectionIndex {
    return self.content.count;
}

- (id)objectAtIndexPath:(NSIndexPath *)indexPath {
    return self.content[indexPath.row];
}

There's not much else to this technique. It seems too simple, almost trivial and useless, but I've found that it solves the problem very well in practice.

Your app might need a more robust cache; for example, you might need one that can be queried or one that needs to be durable. Core Data may still be right for you. If your app is one of the many that are backed by web services, however, your problems aren't the same problems that Core Data was designed to solve, and you should examine simpler solutions.

July 15, 2015

State Negotiations

Functional programmers talk about two things a lot: avoiding side effects and avoiding state. At first, this seems impossible: how the heck am I supposed to write code without side effects and without state? The whole point of programs is to do stuff and remember things! Avoiding side effects is still something I'm figuring out, but this week, I have some tips and tricks on avoiding state.

I've had the most luck with this approach: don't try to avoid state totally; instead, limit it wherever possible.

There are lots of techniques for limiting state, and I'll list a few here. The list isn't complete, but I hope that it provides enough of a jump start to understand the general pattern.

Understating

The broad strategy here in all of these ideas is to reduce the number of instance variables you have, which simplifies your classes. Let's take a simple example, like a table view controller. If you're not using Apple's built-in UITableViewController, you might have an extra @property UITableView *tableView. This generates an additional instance variable. Your -loadView method might look like this:

- (void)loadView {
    UITableView *tableView = [[UITableView alloc] initWithFrame:CGRectZero style:UITableViewStylePlain];
    tableView.dataSource = self;
    tableView.delegate = self;
    self.tableView = tableView;
    self.view = tableView;
}

While it seems easy enough to keep both of these properties in sync, what's actually going on is that one is acting as a "cache" of sorts for the other. Keeping a cache in sync with the primary is always hard, even when it looks easy up front. Even the best programmers make mistakes. Instead, you can use a little-known feature called covariant return types to redeclare the view property with the correct type:

@property (nonatomic) UITableView *view;

Mark it as @dynamic:

@dynamic view;

And then forward the tableView message to the view property:

- (UITableView *)tableView {
    return self.view;
}

With covariant return types, there's no casting required! The compiler knows what you intend. You're using a computed property and nothing needs to be kept in sync anymore because there's nothing to be kept in sync!

Unite the States

Another way to limit instance variables is with state machines. I've written about state machines here before. Before state machines you might have a mish-mash of properties describing the state of, say, a network request:

@property BOOL isUnsent;
@property BOOL isFetching;
@property BOOL isCompleted;
@property NSError *error;
@property NSArray *results;

The problem is that the "space" of this state is huge, and large swaths of that state space are invalid. For example, what does it mean if more than one of the BOOL properties is YES? What if none of them are YES? What if isFetching is YES and the error has a value? To solve this problem, you can keep one property around:

@property SKRequestState *state;

This state property can store values of different types, like SKLoadingState, SKErrorState (which stores the error), and SKCompletedState (which stores the results). You can then make those properties readonly and forward them directly to the state property.

- (BOOL)isUnsent {
    return self.state.isUnsent;
}

- (BOOL)isFetching {
    return self.state.isFetching;
}

- (BOOL)isCompleted {
    return self.state.isCompleted;
}

- (NSError *)error {
    return self.state.error;
}

- (NSArray *)results {
    return self.state.results;
}

All states respond to each of those messages, returning nil where necessary. This way, while the external surface of the object is still the same, you'll never fail to keep the class in internal synchrony.
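Here's a rough sketch of what those state classes could look like. The base class answers NO or nil to everything, and each subclass overrides only what it knows about. SKUnsentState is a name I'm adding for illustration, and treating an errored request as not "completed" is my assumption; the sketch also assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Base state: answers NO or nil to everything, so each subclass overrides only what it knows.
@interface SKRequestState : NSObject
@property (nonatomic, readonly) BOOL isUnsent;
@property (nonatomic, readonly) BOOL isFetching;
@property (nonatomic, readonly) BOOL isCompleted;
@property (nonatomic, readonly) NSError *error;
@property (nonatomic, readonly) NSArray *results;
@end

@implementation SKRequestState
- (BOOL)isUnsent { return NO; }
- (BOOL)isFetching { return NO; }
- (BOOL)isCompleted { return NO; }
- (NSError *)error { return nil; }
- (NSArray *)results { return nil; }
@end

// Hypothetical name for the state before the request is sent.
@interface SKUnsentState : SKRequestState
@end
@implementation SKUnsentState
- (BOOL)isUnsent { return YES; }
@end

@interface SKLoadingState : SKRequestState
@end
@implementation SKLoadingState
- (BOOL)isFetching { return YES; }
@end

// Stores the error; every other question falls through to the base class.
@interface SKErrorState : SKRequestState
- (instancetype)initWithError:(NSError *)error;
@end
@implementation SKErrorState {
    NSError *_error;
}
- (instancetype)initWithError:(NSError *)error {
    self = [super init];
    if (!self) return nil;
    _error = error;
    return self;
}
- (NSError *)error { return _error; }
@end

// Stores the results and reports itself as completed.
@interface SKCompletedState : SKRequestState
- (instancetype)initWithResults:(NSArray *)results;
@end
@implementation SKCompletedState {
    NSArray *_results;
}
- (instancetype)initWithResults:(NSArray *)results {
    self = [super init];
    if (!self) return nil;
    _results = [results copy];
    return self;
}
- (BOOL)isCompleted { return YES; }
- (NSArray *)results { return _results; }
@end
```

With this shape, "fetching with an error" can't be represented at all: the request holds exactly one state object, and moving to a new state means replacing it.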

If you've got a primitive, like a BOOL or an NSInteger in a class, you can ask yourself: is this really just a number, or do I need to wrap it in something that ascribes meaning to it?

If there are two or more primitives in your class, ask: are they unrelated, or should I formalize their relationship in code?

If a property is nil for part of the object's lifecycle, ask: what meaning is hidden in the nothingness of this property, and how can I make that meaning more obvious to the reader of my code?

Using state machines helps enforce honesty about what's complicated.

The Null Hypothesis

Imagine a presenter that downloads a user object from the server and exposes an interface for displaying that user. Sometimes, the presenter encounters an error and displays a different message for the user's name.

@implementation SKUserPresenter

- (void)fetchUser {
    [self.fetcher fetchWithSuccessBlock:^(SKUser *user) {
        self.user = user;
    } failureBlock:^(NSError *error) {
        self.userFetchError = error;
    }];
}

- (NSString *)name {
    if (!self.userFetchError) {
        return self.user.name;
    }
    return @"User not found.";
}

//...

@end

This object is now keeping an extra thing (userFetchError) around just so it can handle a special case. The current intention of this code is that either user or userFetchError can have values, but never both. However, you aren't constrained by the design of the class to ensure this invariant is maintained.

Another member of your team, perhaps future-you, could easily cause user and userFetchError to both have values. To solve this problem, we can constrain our instance variables better by using the Null Object pattern.

@implementation SKUserPresenter

- (void)fetchUser {
    [self.fetcher fetchWithSuccessBlock:^(SKUser *user) {
        self.user = user;
    } failureBlock:^(NSError *error) {
        self.user = [SKMissingUser new];
    }];
}

- (NSString *)name {
    return self.user.name;
}

//...

@end

If you wanted to go the extra mile, you could initialize the SKMissingUser with the error, and have it pull its message from the type of error. The broad pattern is the same, however: fewer instance variables, more simplicity.
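A sketch of that extra mile might look like the following. The SKUser stand-in is cut down to just a name, and both the -initWithError: initializer and the mapping from error to message are assumptions for illustration. The sketch assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Minimal stand-in for the real model; only the name property matters here.
@interface SKUser : NSObject
@property (nonatomic, copy) NSString *name;
@end

@implementation SKUser
@end

// Null object: a user that always answers with a placeholder name.
@interface SKMissingUser : SKUser
- (instancetype)initWithError:(NSError *)error;
@end

@implementation SKMissingUser {
    NSError *_error;
}

- (instancetype)initWithError:(NSError *)error {
    self = [super init];
    if (!self) return nil;
    _error = error;
    return self;
}

- (NSString *)name {
    // Which errors map to which messages is an assumption made for this sketch.
    if ([_error.domain isEqualToString:NSURLErrorDomain] &&
        _error.code == NSURLErrorNotConnectedToInternet) {
        return @"You appear to be offline.";
    }
    // Also the answer when created with plain +new, since the error is nil.
    return @"User not found.";
}

@end
```

The presenter's -name method doesn't change either way: it keeps returning self.user.name and never learns whether it's talking to a real user or a missing one.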

The Fourth Estate

When limiting state, I find the following rules of thumb to be helpful.

  1. Using fewer instance variables is better than using more.
  2. Removing reliance on primitives gives meaning to an object's state as it is exposed internally.
  3. Using computed or readonly properties gives meaning to an object's state as it is exposed externally.

July 8, 2015

Templating Update

I've put my new networking code in a small new codebase, and as I mentioned last week, there were a few extras that I built for it that I wanted to cover. Writing in Swift for a week was nice, but it's back to home territory, where I can be extra effective.

Multipart Requests

One of the first roadblocks I hit when using the new networking library was multipart requests. A multipart request is just a request with a special body that can have text and values mixed in with raw data. It's ideal for uploading images and other big files. This document at the W3C explains the standard in an effective and accessible way.

To support multipart requests, I needed a new request builder, and I needed SKSendableRequest to accept an injected request builder, so I changed its primary initializer to accept a request builder and changed its old initializer to be a convenience method.

@implementation SKSendableRequest

- (instancetype)initWithRequestBuilder:(id<SKRequestBuilder>)requestBuilder {
    self = [super init];
    if (!self) return nil;

    _requestBuilder = requestBuilder;

    return self;
}

- (instancetype)initWithRequestTemplate:(id<SKRequestTemplate>)template {
    SKRequestBuilder *requestBuilder = [[SKRequestBuilder alloc] initWithRequestTemplate:template];
    return [self initWithRequestBuilder:requestBuilder];
}

Multipart requests also send a boundary as part of the Content-Type; this boundary defines where each "part" ends and the next begins. This is sent up in the header of the request itself.

- (NSDictionary *)headers {
    return @{
             @"Accept": @"application/json",
             @"Content-Type": self.contentType,
             };
}

- (NSString *)contentType {
    return [NSString stringWithFormat:@"multipart/form-data; boundary=\"%@\"", self.boundary];
}

- (NSString *)boundary {
    return //a random string here
}

This boundary property was also exposed as an optional part of the <SKRequestTemplate> protocol so that the request builder could access it. (It's also added to the safe template.) Request templates that don't need it can ignore it. For the multipart request builder, it's initialized with a template and some data.
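One detail worth sketching: the header and the body have to agree on the boundary, so the random string should be generated once and then reused. Here's one way to do that, assuming a UUID is random enough. SKUploadTemplate and the "Boundary-" prefix are hypothetical names of mine.

```objc
#import <Foundation/Foundation.h>

// Hypothetical template fragment; only the boundary handling is shown.
@interface SKUploadTemplate : NSObject
@property (nonatomic, copy, readonly) NSString *boundary;
@property (nonatomic, copy, readonly) NSString *contentType;
@end

@implementation SKUploadTemplate

- (instancetype)init {
    self = [super init];
    if (!self) return nil;

    // Generate once: the Content-Type header and the body must use the same string.
    _boundary = [NSString stringWithFormat:@"Boundary-%@", [NSUUID UUID].UUIDString];

    return self;
}

// Computed from the stored boundary, so the two can't drift apart.
- (NSString *)contentType {
    return [NSString stringWithFormat:@"multipart/form-data; boundary=\"%@\"", self.boundary];
}

@end
```

If the boundary getter built a fresh random string on every call instead, the value in the header wouldn't match the one in the body, and the server couldn't split the parts.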

@implementation SKMultipartRequestBuilder <SKRequestBuilder>

- (instancetype)initWithRequestTemplate:(id<SKRequestTemplate>)template data:(NSData *)data {
    self = [super init];
    if (!self) return nil;

    _safeTemplate = [[SKSafeRequestTemplate alloc] initWithTemplate:template];
    _data = data;

    return self;
}

Most of this request builder is similar to the normal request builder, but building the HTTP body is different. This particular multipart request builder is designed for only one "part", but it could be generalized to accept multiple parts.

- (NSData *)HTTPBody {
    NSMutableData *body = [NSMutableData data];
    [body appendData:[self.bodyBeforePart dataUsingEncoding:NSUTF8StringEncoding]];
    [body appendData:self.data];
    [body appendData:[self.bodyAfterPart dataUsingEncoding:NSUTF8StringEncoding]];
    return body;
}

- (NSString *)bodyBeforePart {
    NSMutableString *string = [NSMutableString string];
    [string appendFormat:@"--%@\r\n", self.boundary];
    [string appendFormat:@"Content-Disposition: form-data; name=\"attachment\"; filename=\"filename\"\r\n"];
    [string appendFormat:@"Content-Type: %@\r\n\r\n", @"image/jpeg"];
    return string;
}

- (NSString *)bodyAfterPart {
    return [NSString stringWithFormat:@"\r\n--%@--\r\n", self.boundary];
}

- (NSString *)boundary {
    return self.safeTemplate.boundary;
}

Separating out request construction from request sending early in the process of designing this networking code made it obvious exactly where multipart request construction fits into the structure of the library.
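If you did want to generalize the body to several parts, a sketch could look like this. SKMultipartPart and the SKMultipartBody function are hypothetical and not part of the library; the byte layout follows the single-part version above, repeated once per part. The sketch assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical value object describing one part of the body.
@interface SKMultipartPart : NSObject
@property (nonatomic, copy, readonly) NSString *name;
@property (nonatomic, copy, readonly) NSString *filename;
@property (nonatomic, copy, readonly) NSString *contentType;
@property (nonatomic, copy, readonly) NSData *data;
- (instancetype)initWithName:(NSString *)name
                    filename:(NSString *)filename
                 contentType:(NSString *)contentType
                        data:(NSData *)data;
@end

@implementation SKMultipartPart

- (instancetype)initWithName:(NSString *)name
                    filename:(NSString *)filename
                 contentType:(NSString *)contentType
                        data:(NSData *)data {
    self = [super init];
    if (!self) return nil;

    _name = [name copy];
    _filename = [filename copy];
    _contentType = [contentType copy];
    _data = [data copy];

    return self;
}

@end

// Writes each part after its own "--boundary" line, then closes with "--boundary--".
static NSData *SKMultipartBody(NSArray *parts, NSString *boundary) {
    NSMutableData *body = [NSMutableData data];
    for (SKMultipartPart *part in parts) {
        NSMutableString *header = [NSMutableString string];
        [header appendFormat:@"--%@\r\n", boundary];
        [header appendFormat:@"Content-Disposition: form-data; name=\"%@\"; filename=\"%@\"\r\n",
         part.name, part.filename];
        [header appendFormat:@"Content-Type: %@\r\n\r\n", part.contentType];
        [body appendData:[header dataUsingEncoding:NSUTF8StringEncoding]];
        [body appendData:part.data];
        [body appendData:[@"\r\n" dataUsingEncoding:NSUTF8StringEncoding]];
    }
    NSString *closing = [NSString stringWithFormat:@"--%@--\r\n", boundary];
    [body appendData:[closing dataUsingEncoding:NSUTF8StringEncoding]];
    return body;
}
```

With a single part, this produces the same bytes as the bodyBeforePart, data, bodyAfterPart sequence, so the one-part builder is just the special case.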

Paginatable Requests

Some requests are paginatable, which means they have a page parameter that increments by 1. Let's imagine an endpoint that gets the followers of a user.

@implementation SKFollowersRequest <SKRequestTemplate>

- (instancetype)initWithUserID:(NSString *)userID {
    self = [super init];
    if (!self) return nil;

    _userID = userID;

    return self;
}

- (NSURL *)baseURL {
    return [NSURL URLWithString:@"https://api.khanlou.com"];
}

- (NSDictionary *)parameters {
    return @{};
}

- (NSString *)path {
    return [NSString stringWithFormat:@"users/%@/followers", self.userID];
}

@end

Right now, this request is awesome. It's initialized with everything it needs, it can't be modified, and it's easy to read and process. If we wanted to paginate this request, we'd have to add an extra parameter called page to the request. We'd need either a mutable property called page that we could update, or we'd have to include it in the initializer, preventing the mutation.

What if we had three more endpoints that were all paginated in the same way? Now we've got some duplication, and we'd love to handle pagination in some kind of generic way.

Here is the part of the blog post where I pretend to propose using inheritance to solve the problem, and then describe why it's a bad idea. But inheritance doesn't even make sense here. We could subclass all our paginatable requests from one class, but then we'd have to initialize with the right page and merge each request's parameters with its superclass's. It wouldn't actually save us anything.

Ideally, the request template wouldn't even know that it was paginatable. To that end, let's use decoration to add a page parameter to any request we want. Let's start with an initializer that takes a template and a page. Note that SKPaginatableRequest takes a template but also conforms to templateness. This is the pattern from Nestable.

@implementation SKPaginatableRequest <SKRequestTemplate>

- (instancetype)initWithRequestTemplate:(id<SKRequestTemplate>)template page:(NSInteger)page {
    self = [super init];
    if (!self) return nil;

    _template = template;
    _page = page;

    return self;
}

Let's include a convenience initializer for the first page. We're not monsters.

- (instancetype)initWithRequestTemplate:(id<SKRequestTemplate>)template {
    return [self initWithRequestTemplate:template page:1];
}

If you'll remember, the SKRequestTemplate protocol requires a baseURL, so we have to add that to keep the compiler happy.

- (NSURL *)baseURL {
    return self.template.baseURL;
}

All of the other properties, like method, path, etc., are optional, so we don't need to supply implementations for them. Objective-C allows us to forward any messages to a "friend", with -forwardingTargetForSelector:. We know that SKSafeRequestTemplate wraps each property with -respondsToSelector: checks, so we just have to make sure to update the implementation of -respondsToSelector: as well.

- (BOOL)respondsToSelector:(SEL)aSelector {
    return [super respondsToSelector:aSelector] || [self.template respondsToSelector:aSelector];
}

If the runtime can't find a particular implementation in this class, it should just go to the template. If a method can't be found there either, it will just blow up, as expected.

- (id)forwardingTargetForSelector:(SEL)aSelector {
    return self.template;
}

For the parameters property, we want to do something special. We'll get the page number, and add all the parameters from the request itself. Note the order: we won't overwrite anything from the parameters the original request gives us. Instead, we allow the original request to overwrite our parameters.

- (NSDictionary *)parameters {
    NSMutableDictionary *dictionary = [@{@"page": self.pageAsString} mutableCopy];
    if ([self.template respondsToSelector:@selector(parameters)]) {
        [dictionary addEntriesFromDictionary:self.template.parameters];
    }
    return [dictionary copy];
}

- (NSString *)pageAsString {
    return [NSString stringWithFormat:@"%@", @(self.page)];
}

Finally, before we close out this class, we'll add one nice little touch. Because our pagination is now genericized, we can add nice things like this in one place and get the benefits of them everywhere. Without -requestForNextPage, objects using this class would have to ask what the current page is, and then construct the request themselves. This way, we're telling, not asking.

- (SKPaginatableRequest *)requestForNextPage {
    return [[SKPaginatableRequest alloc] initWithRequestTemplate:self.template page:self.page+1];
}

@end
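To see the forwarding work end to end, here's a condensed, self-contained sketch. The protocol is cut down to three members, the followers template hard-codes its user ID, and the https scheme on the base URL is my assumption. It assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Cut-down restatement of the protocol: only baseURL is required.
@protocol SKRequestTemplate <NSObject>
- (NSURL *)baseURL;
@optional
- (NSString *)path;
- (NSDictionary *)parameters;
@end

// A plain template that knows nothing about pagination.
@interface SKFollowersRequest : NSObject <SKRequestTemplate>
@end

@implementation SKFollowersRequest
- (NSURL *)baseURL { return [NSURL URLWithString:@"https://api.khanlou.com"]; }
- (NSString *)path { return @"users/1234/followers"; }
@end

// The decorator, condensed from the methods in this post.
@interface SKPaginatableRequest : NSObject <SKRequestTemplate>
@property (nonatomic, readonly) id<SKRequestTemplate> template;
@property (nonatomic, readonly) NSInteger page;
- (instancetype)initWithRequestTemplate:(id<SKRequestTemplate>)template page:(NSInteger)page;
- (SKPaginatableRequest *)requestForNextPage;
@end

@implementation SKPaginatableRequest

- (instancetype)initWithRequestTemplate:(id<SKRequestTemplate>)template page:(NSInteger)page {
    self = [super init];
    if (!self) return nil;
    _template = template;
    _page = page;
    return self;
}

- (NSURL *)baseURL { return self.template.baseURL; }

// Claim to respond to anything the wrapped template responds to.
- (BOOL)respondsToSelector:(SEL)aSelector {
    return [super respondsToSelector:aSelector] || [self.template respondsToSelector:aSelector];
}

// Anything not implemented here is handed to the wrapped template.
- (id)forwardingTargetForSelector:(SEL)aSelector { return self.template; }

// Start with the page, then let the template's own parameters win on conflicts.
- (NSDictionary *)parameters {
    NSMutableDictionary *dictionary = [@{@"page": [@(self.page) stringValue]} mutableCopy];
    if ([self.template respondsToSelector:@selector(parameters)]) {
        [dictionary addEntriesFromDictionary:self.template.parameters];
    }
    return [dictionary copy];
}

- (SKPaginatableRequest *)requestForNextPage {
    return [[SKPaginatableRequest alloc] initWithRequestTemplate:self.template page:self.page + 1];
}

@end
```

Wrapping an SKFollowersRequest and asking the wrapper for its path returns the template's "users/1234/followers", even though the wrapper never implements path. Calling -requestForNextPage on page 1 gives a new request whose parameters carry page "2".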

June 30, 2015

Protocol-Oriented Networking

Crusty, the antagonistic character from WWDC 2015 Session 408, has been making the rounds on the blogosphere, writing about protocol-oriented programming. For me, protocol extensions in Swift are easily the coolest new feature, because they enable us to add behavior to a set of data easily.

In a post from a few weeks ago called Templating, I describe a flexible networking architecture that relies heavily on protocols to define and send network requests. It's been working well in practice, and a few cases popped up where the architecture made it really easy to handle new unexpected request types. I hope to write about those soon.

For this post, I'd like to examine the new protocol extensions, and what they can do for this networking design. Ultimately, what we were doing in the Templating post was adding behavior to a set of data. Since I wrote it first in Objective-C, we did it with decoration.

With Swift 2, however, while we could continue to use decoration to wrap our data with new functionality, we've been given a new power with protocol extensions. Protocol extensions let us add concrete methods to a protocol that are dependent on the abstract methods in that protocol. It's a form of the template method pattern, but one that doesn't rely on inheritance.

I'm still very new at the Swift stuff, so you'll have to forgive my sins.

Let's define a lighter version of the SKRequestTemplate protocol from the previous post, but in Swift this time. Since we're adding sendRequest() directly onto the protocol, it's no longer a template, but the request itself.

protocol Request {
    var baseURL : NSURL? { get }
    var method : String { get }
    var path : String { get }
    var parameters : Dictionary<String, String> { get }
}

The baseURL property is marked as an optional, because -[NSURL URLWithString:] returns an optional by default. Since there's no good default for a URL (like the empty string is for strings), we'll prevent the user of our networking protocol from having to use a scary bang and allow her to return an optional here.

Okay, let's define our first request. We'll use GitHub's Zen endpoint, which just returns a short proverb as a string.

struct ZenRequest : Request {
    let baseURL = NSURL(string: "https://api.github.com/")
    let path: String = "zen"
}

Uh-oh, the compiler is already complaining. Swift wants to require us to return something for the method and parameters. That would make this request ugly, though, so we won't be doing that. We could mark each property with the optional keyword, but then we have to mark the whole protocol as @objc, and I want to make this as Swifty as possible. (Ash Furrow lays out this problem neatly in his post, Protocols and Swift.)

Fortunately, we're saved by protocol extensions here. We can give a default implementation for these properties in an extension, and we can leave it out of the individual request structs.

extension Request {
    var method : String { return "GET" }
    var path : String { return "" }
    var parameters : Dictionary<String, String> { return Dictionary() }
}

Now let's add our sendRequest function to the Request protocol:

extension Request {
    func buildRequest() -> NSURLRequest? {
        guard let baseURL = baseURL else { return nil }
        guard let URLComponents = NSURLComponents(URL: baseURL, resolvingAgainstBaseURL: true) else { return nil }
        URLComponents.path = (URLComponents.path ?? "") + path
        guard let URL = URLComponents.URL else { return nil }
        let request = NSMutableURLRequest(URL: URL)
        request.HTTPMethod = method
        return request
    }
    func sendRequest(success success: (string: String) -> (), failure: (error: ErrorType) -> ()) {
        let session = NSURLSession.sharedSession()
        guard let request = buildRequest() else { return }
        guard let task = session.dataTaskWithRequest(request, completionHandler: { (taskData, taskResponse, taskError) -> Void in
            if let taskError = taskError {
                failure (error: taskError)
            } else if let taskData = taskData {
                guard let string = NSString(data: taskData, encoding: NSUTF8StringEncoding) as? String else { return }
                success(string: string)
            }
        }) else { return }
        task.resume()
    }
}

Wow, look at all those guard let statements! Exclamation points are for your writing, not your code. We can now create and send our request:

ZenRequest().sendRequest(
    success: { string in
        print(string)
    },
    failure: { error in
        print(error)
})

This is a really simple version of templated requests (I left out a few complexities, like parameters, headers, JSON parsing, etc.), but I certainly like how succinct it is. A GET request with no parameters is only a few lines of code. Supporting different types of requests, such as multipart requests, means overriding the buildRequest method, and returning whatever cooked URL request is appropriate.