Cocoa Samurai: CocoaHeads

Showing posts with label CocoaHeads. Show all posts

Sunday, April 04, 2010

DM CocoaHeads: Good User Interface Design in Cocoa & Cocoa Touch

http://cocoaheads.org/us/DesMoinesIowa/index.html Thursday, April 8 at 7 pm at the Impromptu Studio in Des Moines,IA (see link above for Map & directions) I recently became the group leader of the Des Moines CocoaHeads, and to get the ball rolling I will be giving a presentation this week on Good User Interface Design in Cocoa & Touch. I thought this would be a good time to do this as several people are starting work on new projects. Some things I'll Cover...

Good & Bad User Interfaces and what makes them good and bad
How to plan out a User Interface (we'll go through an example test app)
UI Design patterns on the Mac and iPhone (and iPad to the extent possible)
Q&A/Discussion on UI Design patterns

If your coming feel free to bring your apps in and we can give your UI a good review and help you improve it. Sometime after this is over i'll take my presentation & post it online here via article or screencast, so keep tuned.

Wednesday, January 20, 2010

Understanding the Objective-C Runtime

Screen shot 2010-01-15 at 10.18.04 AM.png

The Objective-C Runtime is one of the overlooked features of Objective-C initially when people are generally introduced to Cocoa/Objective-C. The reason for this is that while Objective-C (the language) is easy to pick up in only a couple hours, newcomers to Cocoa spend most of their time wrapping their heads around the Cocoa Framework and adjusting to how it works. However the runtime is something that everybody should at least know how it works in some detail beyond knowing that code like [target doMethodWith:var1]; gets translated into objc_msgSend(target,@selector(doMethodWith:),var1); by the compiler. Knowing what the Objective-C runtime is doing will help you gain a much deeper understanding of Objective-C itself and how your app is run. I think Mac/iPhone Developers will gain something from this, regardless of your level of experience.

The Objective-C Runtime is Open Source
The Objective-C Runtime is open source and available anytime from http://opensource.apple.com. In fact examining the Objective-C is one of the first ways I went through to figure out how it worked, beyond reading Apples documentation on the matter. You can download the current version of the runtime (as of this writting) for Mac OS X 10.6.2 here objc4-437.1.tar.gz.

Dynamic vs Static Languages
Objective-C is a runtime oriented language, which means that when it's possible it defers decisions about what will actually be executed from compile & link time to when it's actually executing on the runtime. This gives you a lot of flexibility in that you can redirect messages to appropriate objects as you need to or you can even intentionally swap method implementations, etc. This requires the use of a runtime which can introspect objects to see what they do & don't respond to and dispatch methods appropriately. If we contrast this to a language like C. In C you start out with a main() method and then from there it's pretty much a top down design of following your logic and executing functions as you've written your code. A C struct can't forward requests to perform a function onto other targets. Pretty much you have a program like so

#include < stdio.h >
 
int main(int argc, const char **argv[])
{
        printf("Hello World!");
        return 0;
}

which a compiler parses, optimizes and then transforms your optimized code into assembly

.text
 .align 4,0x90
 .globl _main
_main:
Leh_func_begin1:
 pushq %rbp
Llabel1:
 movq %rsp, %rbp
Llabel2:
 subq $16, %rsp
Llabel3:
 movq %rsi, %rax
 movl %edi, %ecx
 movl %ecx, -8(%rbp)
 movq %rax, -16(%rbp)
 xorb %al, %al
 leaq LC(%rip), %rcx
 movq %rcx, %rdi
 call _printf
 movl $0, -4(%rbp)
 movl -4(%rbp), %eax
 addq $16, %rsp
 popq %rbp
 ret
Leh_func_end1:
 .cstring
LC:
 .asciz "Hello World!"

and then links it together with a library and produces a executable. This contrasts from Objective-C in that while the process is similar the code that the compiler generates depends on the presence of the Objective-C Runtime Library. When we are all initially introduced to Objective-C we are told that (at a simplistic level) what happens to our Objective-C bracket code is something like...

[self doSomethingWithVar:var1];

gets translated to...

objc_msgSend(self,@selector(doSomethingWithVar:),var1);

but beyond this we don't really know much till much later on what the runtime is doing.

What is the Objective-C Runtime?
The Objective-C Runtime is a Runtime Library, it's a library written mainly in C & Assembler that adds the Object Oriented capabilities to C to create Objective-C. This means it loads in Class information, does all method dispatching, method forwarding, etc. The Objective-C runtime essentially creates all the support structures that make Object Oriented Programming with Objective-C Possible.

Objective-C Runtime Terminology
So before we go on much further, let's get some terminology out of the way so we are all on the same page about everything. 2 Runtimes As far as Mac & iPhone Developers are concerned there are 2 runtimes: The Modern Runtime & the Legacy Runtime Modern Runtime: Covers all 64 bit Mac OS X Apps & all iPhone OS Apps Legacy Runtime: Covers everything else (all 32 bit Mac OS X Apps) Method There are 2 basic types of methods. Instance Methods (begin with a '-' like -(void)doFoo; that operate on Object Instances. And Class Methods (begin with a '+' like + (id)alloc. Methods are just like C Functions in that they are a grouping of code that performs a small task like

-(NSString *)movieTitle
{
    return @"Futurama: Into the Wild Green Yonder";
}

Selector A selector in Objective-C is essentially a C data struct that serves as a mean to identify an Objective-C method you want an object to perform. In the runtime it's defined like so...

typedef struct objc_selector  *SEL;

and used like so...

SEL aSel = @selector(movieTitle);

Message

[target getMovieTitleForObject:obj];

An Objective-C Message is everything between the 2 brackets '[ ]' and consists of the target you are sending a message to, the method you want it to perform and any arguments you are sending it. A Objective-C message while similar to a C function call is different. The fact that you send a message to an object doesn't mean that it'll perform it. The Object could check who the sender of the message is and based on that decide to perform a different method or forward the message onto a different target object. Class If you look in the runtime for a class you'll come across this...

typedef struct objc_class *Class;
typedef struct objc_object {
    Class isa;
} *id;

Here there are several things going on. We have a struct for an Objective-C Class and a struct for an object. All the objc_object has is a class pointer defined as isa, this is what we mean by the term 'isa pointer'. This isa pointer is all the Objective-C Runtime needs to inspect an object and see what it's class is and then begin seeing if it responds to selectors when you are messaging objects. And lastly we see the id pointer. The id pointer by default tells us nothing about Objective-C objects except that they are Objective-C objects. When you have a id pointer you can then ask that object for it's class, see if it responds to a method, etc and then act more specifically when you know what the object is that you are pointing to. You can see this as well on Blocks in the LLVM/Clang docs

struct Block_literal_1 {
    void *isa; // initialized to &_NSConcreteStackBlock or &_NSConcreteGlobalBlock
    int flags;
    int reserved; 
    void (*invoke)(void *, ...);
    struct Block_descriptor_1 {
 unsigned long int reserved; // NULL
     unsigned long int size;  // sizeof(struct Block_literal_1)
 // optional helper functions
     void (*copy_helper)(void *dst, void *src);
     void (*dispose_helper)(void *src); 
    } *descriptor;
    // imported variables
};

Blocks themselves are designed to be compatible with the Objective-C runtime so they are treated as objects so they can respond to messages like -retain,-release,-copy,etc. IMP (Method Implementations)

typedef id (*IMP)(id self,SEL _cmd,...);

IMP's are function pointers to the method implementations that the compiler will generate for you. If your new to Objective-C you don't need to deal with these directly until much later on, but this is how the Objective-C runtime invokes your methods as we'll see soon. Objective-C Classes So what's in an Objectve-C Class? The basic implementation of a class in Objective-C looks like

@interface MyClass : NSObject {
//vars
NSInteger counter;
}
//methods
-(void)doFoo;
@end

but the runtime has more than that to keep track of

#if !__OBJC2__
    Class super_class                                        OBJC2_UNAVAILABLE;
    const char *name                                         OBJC2_UNAVAILABLE;
    long version                                             OBJC2_UNAVAILABLE;
    long info                                                OBJC2_UNAVAILABLE;
    long instance_size                                       OBJC2_UNAVAILABLE;
    struct objc_ivar_list *ivars                             OBJC2_UNAVAILABLE;
    struct objc_method_list **methodLists                    OBJC2_UNAVAILABLE;
    struct objc_cache *cache                                 OBJC2_UNAVAILABLE;
    struct objc_protocol_list *protocols                     OBJC2_UNAVAILABLE;
#endif

We can see a class has a reference to it's superclass, it's name, instance variables, methods, cache and protocols it claims to adhere to. The runtime needs this information when responding to messages that message your class or it's instances.

So Classes define objects and yet are objects themselves? How does this work
Yes earlier I said that in objective-c classes themselves are objects as well, and the runtime deals with this by creating Meta Classes. When you send a message like [NSObject alloc] you are actually sending a message to the class object, and that class object needs to be an instance of the MetaClass which itself is an instance of the root meta class. While if you say subclass from NSObject, your class points to NSObject as it's superclass. However all meta classes point to the root metaclass as their superclass. All meta classes simply have the class methods for their method list of messages that they respond to. So when you send a message to a class object like [NSObject alloc] then objc_msgSend() actually looks through the meta class to see what it responds to then if it finds a method, operates on the Class object.

Why we subclass from Apples Classes
So initially when you start Cocoa development, tutorials all say to do things like subclass NSObject and start then coding something and you enjoy a lot of benefits simply by inheriting from Apples Classes. One thing you don't even realize that happens for you is setting your objects up to work with the Objective-C runtime. When we allocate an instance of one of our classes it's done like so...

MyObject *object = [[MyObject alloc] init];

the very first message that gets executed is +alloc. If you look at the documentation it says that "The isa instance variable of the new instance is initialized to a data structure that describes the class; memory for all other instance variables is set to 0." So by inheriting from Apples classes we not only inherit some great attributes, but we inherit the ability to easily allocate and create our objects in memory that matches a structure the runtime expects (with a isa pointer that points to our class) & is the size of our class.

So what's with the Class Cache? ( objc_cache *cache )
When the Objective-C runtime inspects an object by following it's isa pointer it can find an object that implements many methods. However you may only call a small portion of them and it makes no sense to search the classes dispatch table for all the selectors every time it does a lookup. So the class implements a cache, whenever you search through a classes dispatch table and find the corresponding selector it puts that into it's cache. So when objc_msgSend() looks through a class for a selector it searches through the class cache first. This operates on the theory that if you call a message on a class once, you are likely to call that same message on it again later. So if we take this into account this means that if we have a subclass of NSObject called MyObject and run the following code

MyObject *obj = [[MyObject alloc] init];
 
@implementation MyObject
-(id)init {
    if(self = [super init]){
        [self setVarA:@”blah”];
    }
    return self;
}
@end

the following happens (1) [MyObject alloc] gets executed first. MyObject class doesn't implement alloc so we will fail to find +alloc in the class and follow the superclass pointer which points to NSObject (2) We ask NSObject if it responds to +alloc and it does. +alloc checks the receiver class which is MyObject and allocates a block of memory the size of our class and initializes it's isa pointer to the MyObject class and we now have an instance and lastly we put +alloc in NSObject's class cache for the class object (3) Up till now we were sending a class messages but now we send an instance message which simply calls -init or our designated initializer. Of course our class responds to that message so -(id)init get's put into the cache (4) Then self = [super init] gets called. Super being a magic keyword that points to the objects superclass so we go to NSObject and call it's init method. This is done to insure that OOP Inheritance works correctly in that all your super classes will initialize their variables correctly and then you (being in the subclass) can initialize your variables correctly and then override the superclasses if you really need to. In the case of NSObject, nothing of huge importance goes on, but that is not always the case. Sometimes important initialization happens. Take this...

#import < Foundation/Foundation.h>
 
@interface MyObject : NSObject
{
 NSString *aString;
}
 
@property(retain) NSString *aString;
 
@end
 
@implementation MyObject
 
-(id)init
{
 if (self = [super init]) {
  [self setAString:nil];
 }
 return self;
}
 
@synthesize aString;
 
@end
 
 
 
int main (int argc, const char * argv[]) {
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
 
 id obj1 = [NSMutableArray alloc];
 id obj2 = [[NSMutableArray alloc] init];
  
 id obj3 = [NSArray alloc];
 id obj4 = [[NSArray alloc] initWithObjects:@"Hello",nil];
  
 NSLog(@"obj1 class is %@",NSStringFromClass([obj1 class]));
 NSLog(@"obj2 class is %@",NSStringFromClass([obj2 class]));
  
 NSLog(@"obj3 class is %@",NSStringFromClass([obj3 class]));
 NSLog(@"obj4 class is %@",NSStringFromClass([obj4 class]));
  
 id obj5 = [MyObject alloc];
 id obj6 = [[MyObject alloc] init];
  
 NSLog(@"obj5 class is %@",NSStringFromClass([obj5 class]));
 NSLog(@"obj6 class is %@",NSStringFromClass([obj6 class]));
  
 [pool drain];
    return 0;
}

Now if you were new to Cocoa and I asked you to guess as to what would be printed you'd probably say

NSMutableArray
NSMutableArray 
NSArray
NSArray
MyObject
MyObject

but this is what happens

obj1 class is __NSPlaceholderArray
obj2 class is NSCFArray
obj3 class is __NSPlaceholderArray
obj4 class is NSCFArray
obj5 class is MyObject
obj6 class is MyObject

This is because in Objective-C there is a potential for +alloc to return an object of one class and then -init to return an object of another class.

So what happens in objc_msgSend anyway?
There is actually a lot that happens in objc_msgSend(). Lets say we have code like this...

[self printMessageWithString:@"Hello World!"];

it actually get's translated by the compiler to...

objc_msgSend(self,@selector(printMessageWithString:),@"Hello World!");

From there we follow the target objects isa pointer to lookup and see if the object (or any of it's superclasses) respond to the selector @selector(printMessageWithString:). Assuming we find the selector in the class dispatch table or it's cache we follow the function pointer and execute it. Thus objc_msgSend() never returns, it begins executing and then follows a pointer to your methods and then your methods return, thus looking like objc_msgSend() returned. Bill Bumgarner went into much more detail ( Part 1, Part 2 & Part 3) on objc_msgSend() than I will here. But to summarize what he said and what you'd see looking at the Objective-C runtime code... 1. Checks for Ignored Selectors & Short Circut - Obviously if we are running under garbage collection we can ignore calls to -retain,-release, etc 2. Check for nil target. Unlike other languages messaging nil in Objective-C is perfectly legal & there are some valid reasons you'd want to. Assuming we have a non nil target we go on... 3. Then we need to find the IMP on the class, so we first search the class cache for it, if found then follow the pointer and jump to the function 4. If the IMP isn't found in the cache then the class dispatch table is searched next, if it's found there follow the pointer and jump to the pointer 5. If the IMP isn't found in the cache or class dispatch table then we jump to the forwarding mechanism This means in the end your code is transformed by the compiler into C functions. So a method you write like say...

-(int)doComputeWithNum:(int)aNum

would be transformed into...

int aClass_doComputeWithNum(aClass *self,SEL _cmd,int aNum)

And the Objective-C Runtime calls your methods by invoking function pointers to those methods. Now I said that you cannot call those translated methods directly, however the Cocoa Framework does provide a method to get at the pointer...

//declare C function pointer
int (computeNum *)(id,SEL,int);
 
//methodForSelector is COCOA & not ObjC Runtime
//gets the same function pointer objc_msgSend gets
computeNum = (int (*)(id,SEL,int))[target methodForSelector:@selector(doComputeWithNum:)];
 
//execute the C function pointer returned by the runtime
computeNum(obj,@selector(doComputeWithNum:),aNum);

In this way you can get direct access to the function and directly invoke it at runtime and even use this to circumvent the dynamism of the runtime if you absolutely need to make sure that a specific method is executed. This is the same way the Objective-C Runtime invokes your method, but using objc_msgSend().

Objective-C Message Forwarding
In Objective-C it's very legal (and may even be an intentional design decision) to send messages to objects to which they don't know how to respond to. One reason Apple gives for this in their docs is to simulate multiple inheritance which Objective-C doesn't natively support, or you may just want to abstract your design and hide another object/class behind the scenes that deals with the message. This is one thing that the runtime is very necessary for. It works like so 1. The Runtime searches through the class cache and class dispatch table of your class and all the super classes, but fails to to find the specified method 2. The Objective-C Runtime will call + (BOOL) resolveInstanceMethod:(SEL)aSEL on your class. This gives you a chance to provide a method implementation and tell the runtime that you've resolved this method and if it should begin to do it's search it'll find the method now. You could accomplish this like so... define a function...

void fooMethod(id obj, SEL _cmd)
{
 NSLog(@"Doing Foo");
}

you could then resolve it like so using class_addMethod()...

+(BOOL)resolveInstanceMethod:(SEL)aSEL
{
    if(aSEL == @selector(doFoo:)){
        class_addMethod([self class],aSEL,(IMP)fooMethod,"v@:");
        return YES;
    }
    return [super resolveInstanceMethod];
}

The "v@:" in the last part of class_addMethod() is what the method is returning and it's arguments. You can see what you can put there in the Type Encodings section of the Runtime Guide. 3. The Runtime then calls - (id)forwardingTargetForSelector:(SEL)aSelector. What this does is give you a chance (since we couldn't resolve the method (see #2 above)) to point the Objective-C runtime at another object which should respond to the message, also this is better to do before the more expensive process of invoking - (void)forwardInvocation:(NSInvocation *)anInvocation takes over. You could implement it like so

- (id)forwardingTargetForSelector:(SEL)aSelector
{
    if(aSelector == @selector(mysteriousMethod:)){
        return alternateObject;
    }
    return [super forwardingTargetForSelector:aSelector];
}

Obviously you don't want to ever return self from this method or it could result in an infinite loop. 4. The Runtime then tries one last time to get a message sent to it's intended target and calls - (void)forwardInvocation:(NSInvocation *)anInvocation. If you've never seen NSInvocation, it's essentially an Objective-C Message in object form. Once you have an NSInvocation you essentially can change anything about the message including it's target, selector & arguments. So you could do...

-(void)forwardInvocation:(NSInvocation *)invocation
{
    SEL invSEL = invocation.selector;
 
    if([altObject respondsToSelector:invSEL]) {
        [invocation invokeWithTarget:altObject];
    } else {
        [self doesNotRecognizeSelector:invSEL];
    }
}

by default if you inherit from NSObject it's - (void)forwardInvocation:(NSInvocation *)anInvocation implementation simply calls -doesNotRecognizeSelector: which you could override if you wanted to for one last chance to do something about it.

Non Fragile ivars (Modern Runtime)
One of the things we recently gained in the modern runtime is the concept of Non Fragile ivars. When compiling your classes a ivar layout is made by the compiler that shows where to access your ivars in your classes, this is the low level detail of getting a pointer to your object, seeing where the ivar is offset in relation to the beginning of the bytes the object points at, and reading in the amount of bytes that is the size of the type of variable you are reading in. So your ivar layout may look like this, with the number in the left column being the byte offset.

Here we have the ivar layout for NSObject and then we subclass NSObject to extend it and add on our own ivars. This works fine until Apple ships a update or all new Mac OS X 10.x release and this happens

Your custom objects get wiped out because we have an overlapping superclass. The only alternative that could prevent this is if Apple sticked with the layout it had before, but if they did that then their Frameworks could never advance because their ivar layouts were frozen in stone. Under fragile ivars you have to recompile your classes that inherit from Apples classes to restore compatibility. So what Happens under non fragile ivars?

Under Non Fragile ivars the compiler generates the same ivar layout as under fragile ivars. However when the runtime detects an overlapping superclass it adjusts the offsets to your additions to the class, thus your additions in a subclass are preserved.

Objective-C Associated Objects
One thing recently introduced in Mac OS X 10.6 Snow Leopard was called Associated References. Objective-C has no support for dynamically adding on variables to objects unlike some other languages that have native support for this. So up until now you would have had to go to great lengths to build the infrastructure to pretend that you are adding a variable onto a class. Now in Mac OS X 10.6, the Objective-C Runtime has native support for this. If we wanted to add a variable to every class that already exists like say NSView we could do so like this...

#import < Cocoa/Cocoa.h> //Cocoa
#include < objc/runtime.h> //objc runtime api’s
 
@interface NSView (CustomAdditions)
@property(retain) NSImage *customImage;
@end
 
@implementation NSView (CustomAdditions)
 
static char img_key; //has a unique address (identifier)
 
-(NSImage *)customImage
{
    return objc_getAssociatedObject(self,&img_key);
}
 
-(void)setCustomImage:(NSImage *)image
{
    objc_setAssociatedObject(self,&img_key,image,
                             OBJC_ASSOCIATION_RETAIN);
}
 
@end

you can see in runtime.h the options for how to store the values passed to objc_setAssociatedObject().

/* Associated Object support. */
 
/* objc_setAssociatedObject() options */
enum {
    OBJC_ASSOCIATION_ASSIGN = 0,
    OBJC_ASSOCIATION_RETAIN_NONATOMIC = 1,
    OBJC_ASSOCIATION_COPY_NONATOMIC = 3,
    OBJC_ASSOCIATION_RETAIN = 01401,
    OBJC_ASSOCIATION_COPY = 01403
};

These match up with the options you can pass in the @property syntax.

Hybrid vTable Dispatch
If you look through the modern runtime code you'll come across this (in objc-runtime-new.m)...

/***********************************************************************
* vtable dispatch
* 
* Every class gets a vtable pointer. The vtable is an array of IMPs.
* The selectors represented in the vtable are the same for all classes
*   (i.e. no class has a bigger or smaller vtable).
* Each vtable index has an associated trampoline which dispatches to 
*   the IMP at that index for the receiver class's vtable (after 
*   checking for NULL). Dispatch fixup uses these trampolines instead 
*   of objc_msgSend.
* Fragility: The vtable size and list of selectors is chosen at launch 
*   time. No compiler-generated code depends on any particular vtable 
*   configuration, or even the use of vtable dispatch at all.
* Memory size: If a class's vtable is identical to its superclass's 
*   (i.e. the class overrides none of the vtable selectors), then 
*   the class points directly to its superclass's vtable. This means 
*   selectors to be included in the vtable should be chosen so they are 
*   (1) frequently called, but (2) not too frequently overridden. In 
*   particular, -dealloc is a bad choice.
* Forwarding: If a class doesn't implement some vtable selector, that 
*   selector's IMP is set to objc_msgSend in that class's vtable.
* +initialize: Each class keeps the default vtable (which always 
*   redirects to objc_msgSend) until its +initialize is completed.
*   Otherwise, the first message to a class could be a vtable dispatch, 
*   and the vtable trampoline doesn't include +initialize checking.
* Changes: Categories, addMethod, and setImplementation all force vtable 
*   reconstruction for the class and all of its subclasses, if the 
*   vtable selectors are affected.
**********************************************************************/

The idea behind this is that the runtime is trying to store in this vtable the most called selectors so this in turn speeds up your app because it uses fewer instructions than objc_msgSend. This vtable is the 16 most called selectors which make up an overwheling majority of all the selectors called globally, in fact further down in the code you can see the default selectors for Garbage Collected & non Garbage Collected apps...

static const char * const defaultVtable[] = {
    "allocWithZone:", 
    "alloc", 
    "class", 
    "self", 
    "isKindOfClass:", 
    "respondsToSelector:", 
    "isFlipped", 
    "length", 
    "objectForKey:", 
    "count", 
    "objectAtIndex:", 
    "isEqualToString:", 
    "isEqual:", 
    "retain", 
    "release", 
    "autorelease", 
};
static const char * const defaultVtableGC[] = {
    "allocWithZone:", 
    "alloc", 
    "class", 
    "self", 
    "isKindOfClass:", 
    "respondsToSelector:", 
    "isFlipped", 
    "length", 
    "objectForKey:", 
    "count", 
    "objectAtIndex:", 
    "isEqualToString:", 
    "isEqual:", 
    "hash", 
    "addObject:", 
    "countByEnumeratingWithState:objects:count:", 
};

So how will you know if your dealing with it? You'll see one of several methods called in your stack traces while your debugging. All of these you should basically treat just like they are objc_msgSend() for debugging purposes... objc_msgSend_fixup happens when the runtime is assigning one of these methods that your calling a slot in the vtable. objc_msgSend_fixedup occurs when your calling one of these methods that was supposed to be in the vtable but is no longer in there objc_msgSend_vtable[0-15] you'll might see a call to something like objc_msgSend_vtable5 this means you are calling one of these common methods in the vtable. The runtime can assign and unassign these as it wants to, so you shouldn't count on the fact that objc_msgSend_vtable10 corresponds to -length on one run means it'll ever be there on any of your next runs.

Conclusion
I hope you liked this, this article essentially makes up the content I covered in my Objective-C Runtime talk to the Des Moines Cocoaheads (a lot to pack in for as long a talk as we had.) The Objective-C Runtime is a great piece of work, it does a lot powering our Cocoa/Objective-C apps and makes possible so many features we just take for granted. Hope I hope if you haven't yet you'll take a look through these docs Apple has that show how you can take advantage of the Objective-C Runtime. Thanks! Objective-C Runtime Programming Guide Objective-C Runtime Reference

Sunday, April 13, 2008

A Guide to Threading on Leopard

Authors Note: This is a campanion article to the talk I gave @ CocoaHeads on Thursday April 10, 2008. You can download the a copy of the talks from http://www.1729.us/cocoasamurai/Leopard%20Threads.pdf.

Intro to Threading

It's becoming abundantly clear that one big way you can increase application performance is multithreading, this is because increasing processor speeds are no longer a viable route for increasing application performance, although it does help. Multithreading is splitting up your application into multiple threads that execute concurrently with some threads having access to the same data structures that your main thread has access to.

A benefit to this technique is that its possible for your application to have a whole core to itself or that your app's main thread can reside on one core and a thread you spawn off can be spawned onto your 2nd core or 3rd,etc.

Why is this becoming an Issue?

This is becoming more and more of an issue these days because all the Mac's Apple sells are at least Dual Core machines with Intel Core 2 Duo's or 2x Quad Core Intel Xeon on the Mac Pro's. In other words Apple doesn't sell any single core Mac's (leaving out the iPhone in this instance) anymore, so now we have all this computational wealth to take advantage of, it's foolish not to take advantage of multithreading if it can benefit your application.

Why & When you Should & Shouldn't Thread

Now with all this talk I suppose it may be uplifting you to get on this bandwagon and put a bunch of threads in your app right now. But I should give you some info on why you should and shouldn't thread. Multithreading is not a tool that should be used because you can. One reason it's coming into use more and more is that when your app starts out it is given 1 thread and a Run Loop that receives events. If you are performing an incredibly long calculation on that thread it freezes your user interface leaving your users incredibly frustrated that they cannot do anything while this calculation is going on. Because of this multithreading became a topic of interest beyond just the operating system designers and became an issue for us as App designers. The solution was to spawn these long operations onto their own threads that run at the same time in the background while your user interface is still responsive to the users interactions. So what are some good candidates for threading operations?

File / IO / Networking Operations
API's that explicitly state they will lock the calling thread and create their own thread
Any compartmental/modular task that's at least 10ms

10 milliseconds is a bit ambiguous though. Does 10ms mean 10ms on a Core 2 Duo MacBook Pro with 4GB RAM or 10ms on a Intel Xeon with 16GB RAM or what? This 10ms time suggestion that's repeated many times over in Apples Documentation, the only suggestion I can give you is that if you gage that your app is taking at least 10ms on your machine at least it's a good candidate for threading assuming your machine is a Mac that your target audience might be using. In the end you need to do testing to see if your methods are taking at least 10ms. If you need to you can use Instruments or Shark to gage your applications performance. I made a quick screencast below showing how to do so with Instruments

If you want to download this all you need is a free Viddler Account and download the original video from http://www.viddler.com/explore/Machx/videos/2/ and go to the download tab and you can see the original upload file and download it. This is my very first screencast so if you have any comments/suggestions feel free to let me know. When shouldn't you thread then?

If many threads are trying to access/modify shared data structures
If the task doesn't take at least 10ms
If the end result is that you app is taking unnecessary memory due to threading
If threads slow down your app more than it speeds it up

If your threads are trying to access a shared data structure isn't a no no when it comes to threads, because idealistically you should never spawn threads that do this, however in the real world this isn't always possible. When you spawn threads that are all vying for access to a shared data structure without locks you get inconsistent data across threads, which can leave your app acting oddly or even crash if your app depends on certain states in the data structure. However if you are a good programmer and provide locks in your application, it can lead to lock contention where you are putting locks in all the proper places, but because there are many threads trying to access the same thing it leads to a backlog of threads waiting to get access to the same data structure. Apple repeatedly quotes the 10ms rule (see discussion above) and is their suggestion that if your under 10ms it's not worth spawning a thread for the particular task. It's also important to remember that every time you spawn a thread, memory is allocated to in terms of Kernel Data Structures and Stack Space for allocating variables on the thread. If you are spawning a lot of threads your threads are consuming memory on your system and could end up slowing it down more than it really benefits it too.

Thread Costs

Every time you create thread it's important to note that there is an inherent cost to that creation in terms of CPU Time and Memory allocation. According to Apples Documentation Thread creation has the following costs as measured on a iMac Core 2 Duo with 1GB RAM:

Kernel Data Structures	1 KB
Stack Space	512 KB 2nd*/8MB Main
Creation Time	90 microseconds
Mutex Acquisition Time	0.2 microseconds
Atomic Compare & Swap	0.05 microseconds

* = 16 KB Stack Minimum and the stack space has to be a multiple of 4 KB Thus if your spawning many threads in your application you are consuming a great deal of resources and not only that if you spawn way too many threads your threads could possibly be fighting for CPU resources.

Warning!

It's also important to note that although threads are a tool just like anything else in the Cocoa/Foundation frameworks, if you misuse threads and screw up with them you could really screw up bad with them. My advice if anything is that you should never use threads because you simply can use them, but rather you use them because you've gaged the performance of your application and looked at several methods and decided that they could benefit from the concurrency that it offers. That means going into Shark and/or instruments and developing some performance metric for your application and watching how it does threaded/non-threaded. My last piece of advice is that you pay attention to what is and isn't thread safe in Apples Documentation. In the Multithreaded Programming Guide Apple does list off some of the classes that are and aren't thread safe. A general rule for thread safety is that mutable objects are generally not thread safe while immutable objects are generally thread safe. If Apples Documentation on the class you are using doesn't make an explicit reference to the class being thread safe, you should assume that it is not thread safe.

How Threads are Implemented on Mac OS X

Mach Threads
At the lowest level on Mac OS X Threads are actually implemented as Mach Threads, however you probably won't ever be dealing with them directly. According to the Tech Note 2028 User Space programs should not create mach threads directly. If you want to know more about Mach Threads I would suggest buying Amit Singh's Mac OS X Internals Book and reading up on them there, where he describes them in great detail. POSIX Threads
POSIX Threads are the next type of threads and these threads are layered on top of Mach Threads. POSIX Stands for Portable Operating System Interface and is a common set of API's for performing many tasks, amongst them creating and managing threads. You can use threads in your application simply by including the pthread.h header file in your class and you have access to the POSIX API's. This is the level you will probably go down to if you need finer grain control over your threads than the higher level API's offer. High Level API's ( NSObject / NSThread / NSOperation )
Cocoa offers an array of Threading API's in Leopard that make it convenient and easy to thread. NSObject
Starting with Mac OS X 10.5 Leopard NSObject gained a new API called - (void)performSelectorInBackground:(SEL)aSelector withObject:(id)arg. This method makes it easy to dispatch a new thread with the selector and arguments provided. It's interesting to note that this is essentially the same as NSThreads Class method + (void)detachNewThreadSelector:(SEL)aSelector toTarget:(id)aTarget withObject:(id)anArgument with the benefit being that with the NSObject method you no longer have to specify a target, instead you are calling the method on the intended target. When you call the method performSelectorInBackground: withObject: you are in essence spawning off a new thread with the class of the object that immediately goes to execute the selector specified. So take this example:

[CWObject performSelectorInBackground:@selector(threadMethod:) withObject:nil];

is essentially the same as spawning a new thread with the class CWObject that immediately starts executing the method threadMethod:. This is a convenience method and puts your application into multithreaded mode. NSThread
If you've ever dealt with threading pre-Leopard chances are that you probably came in contact with NSThreads Class Method detachNewThreadSelector: toTarget: withObject:. However it's gotten a couple new tricks. For starters you can now instantiate NSThread objects and start them when you want to and you can create your own NSThread subclasses and override it's - (void)main method, which makes it look similar to NSOperation, minus many of the benefits NSOperation brings to threading operations including KVO compliance. NSOperation / NSOperationQueue
NSOperation is the sexy new kid on the block and brings a lot to the table when it comes to threading. It's an abstract class that you subclass and override it's - (void)main method just like NSThread, and that is all that is required from thereon out to instantiate your subclass and put it into action. Additionally NSOperation is fully KVO compliant. NSOperationQueue brings easy thread management to the table. It can poll your application performance, system performance/load and automatically spawn as many threads as it thinks your system can handle, making it future proof with regard to new hardware coming down the line from Apple, though I should note that in the documentation it says that it will most likely spawn off as many threads as your system has cores.

Thread Locks

I mentioned Thread Safety earlier on and a important part of that is that you make sure at critical points in your code that only 1 thread has access to a portion of code at a time to make sure that the value the thread retrieves is still correct or that 2 threads aren't changing the same value at the same time. There are several methods you can use to put locks on in your application. @synchronized Objective-C Directive Objective-C has a built in mechanism for thread locking. The @synchronized directive will take any Objective-C object including self. Whatever you pass to the @synchronized should be a unique identifier and will block other threads from accessing the block of code in between the brackets after the @synchronized code. In addition to providing a thread blocking mechanism, it also provides for Objective-C exception handling which means exception handling has to be enabled in your application in order to use it. When exceptions are thrown it releases the lock and throws the exception to the next exception handler. Some examples of how to use @synchronized are below:

- (void)veryCriticalThreadMethod
{
    @synchronized(self)
    {
        /* very critical code here */
    }
}

Apples doc's also make reference to a way you can just identify and lock your method down

- (void)veryCriticalThreadMethod
{
    @synchronized(NSStringFromSelector(_cmd))
    {
        /* very critical code here */
    }
}

In this case we are passing a string to the @synchronized directive which uses NSStringFromSelector method and passes in _cmd , which is just a reference to the methods own selector. Additionally if you are trying to ensure that only 1 thread can change the value of an object at a time you should just simply pass in that object there. NSLocks The @synchronized might be the most convenient method to implement thread locking, however it is not the only method by far, and in some situations is not the right one to apply. Here are some of the other methods for thread locking

NSLock NSLock is the most basic form of locking, it uses POSIX Thread Locking (link above) to implement it's locking.

NSLock *myLock = [[NSLock alloc] init];

[myLock lock];

/* critical code here */

[myLock unlock];

To use it in it's basic form all you need to do is to simply create an instance of a NSLock and call lock on it, then put your code after the lock call and when you are ready call unlock on it. I should note that if you have a situation where you try to call lock on it again you will get a deadlock because the lock is already locked and can't lock again. This brings me to NSRecursiveLock NSRecursiveLock NSRecursiveLock is just like NSLock in that it will provide locking for your thread, but it will allow you to call lock on it again if you are using the lock in a recursive manner like its name suggests.

NSRecursiveLock *myLock = [[NSRecursiveLock alloc] init];

-(void)veryCriticalThreadMethod
{
    [myLock lock];

    [self incrementCounterBy:2];

    [myLock unlock];
}

-(void)incrementCounterBy:(NSInteger)aNum
{
    [myLock lock];
    
    self.counter += aNum;
    
    [myLock unlock]
}

NSConditionLock NSConditionLock does as it's name suggests and gives you a set of API's for developing your own system as to when it should and shouldn't lock. It's conditions are all NSIntegers you pass in along with possible date references to lock down your thread. POSIX Thread Locking It is possible to use POSIX Thread Lock methods in your code since your thread is residing on top of a POSIX thread anyway. To use a POSIX Lock in your code you need to include the pthread.h header and use it like such

#include <pthread.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

if(pthread_mutex_lock(&mtx))
    exit(-1); /* your lock has failed */

/* super vital secret code here */

if(pthread_mutex_unlock(&mtx))
    exit(-1); /* your unlock has failed */

It should be noted that the Google Mac team has blogged about this. They particularly found the @synchronized directives setup to be bloated for what they needed, because it not only locked the thread, but the exception handling setup was causing their app to not perform as well as they wanted. Their solution in the end was to use the pthread lock directly. Proper Locking with KVO Notifications As noted by Chris Kane on the Cocoa-Dev mailing list, KVO is thread safe but the manner you use the willChangeValueForKey and didChangeValueForKey can screw things up if you use it wrong. Thus if you use KVO notifications you should apply your lock send the willChangeValueForKey notification, modify your object, and then send the didChangeValueForKey notification like this:

NSLock *myLock;

[myLock lock];

[self willChangeValueForKey:@”someKey”];

someKey = somethingElse;

[self didChangeValueForKey:@”someKey”];

[myLock unlock];

He also discusses a receptionist pattern that can be used to perform KVO updates on the main thread from a 2nd+ thread. Locking with Threaded Drawing Another topic of interest for people is how you get threads to be able to draw to an NSView. Although I won't dive too deep into here the basic operations you need to follow when doing threaded drawing involve locking the NSView, performing your drawing and then unlocking the view like so

/* important that we acquire a lock for the view first */
if([ourView lockFocusIfCanDraw]) {
    
    /* do drawing code here */
        
    /* flushes the contents of the offscreen buffer to the screen if flushing & buffered is enabled */
    [[drawView window] flushWindow]; 
    /* relinquish our lock on the view */
    [ourView unlockFocus];
}

NSObject In Leopard NSObject gained a method called performSelectorInBackground: withObject, which I mentioned earlier is essentially the same as NSThreads class method for dispatching new threads minus the argument to which target object you want doing the operation.

/* in AppController.m */

[self performSelectorInBackground:@selector(doThreadingMethod:)
                       withObject:nil];
                    
/* is roughly equivalent to */

[NSThread detachNewThreadSelector:@selector(doThreadingMethod:)
                         toTarget:self
                        withObject:nil];

NSThread If you've ever been doing any threading, or looked at any Cocoa's project that did threading prior to Leopard you probably ran into NSThread and it's class method + (void)detachNewThreadSelector:(SEL)aSelector toTarget:(id)aTarget withObject:(id)anArgument which is shown in the source code example just before this. NSThread is a easy way to get a thread going. However it's got a new ability in leopard to create instances of NSThread and later on when you want to call start on it and then it will create your thread.

NSThread *myThread;

myThread = [[NSThread alloc] initWithTarget:self
                 selector:@selector(doSomething:)
                   object:nil];

/* some code */

[myThread start]; //start thread

All of this can be more convenient to you depending on how you like to use NSThread. However an ever better new ability in my opinion is your ability to subclass NSThread and override it's -(void)main method like so:

@interface MyThread : NSThread {
    NSString *aString;
}
@property(copy) NSString *aString;
@end

@implementation MyThread

@synthesize aString;

- (void)main
{
    self.aString = @"Super Big String that needed a thread";
}

@end

Just like as is the case with a generic NSThread object you simply create an instance of your NSThread subclass and call start on it when you need it to dispatch a new thread. It's also very important to Note that whatever method you call with the NSThread class method setup a NSAutoReleasePool and free it at the end of the method for non garbage collected applications, if your application is garbage collected you don't have to worry about this as this is automatically done for you. NSOperation I've made it no secret that I have a particular fondness of NSOperation and have been evangelizing it for a while now and for good reason. NSOperation brings the sexy to threading and makes it easy to not only create them, but with KVO and NSOperationQueue take a great deal of control of your threading operations. In particular I like this trend Apple is evangelizing of breaking your code up into specialized chunks that in combination with your custom initializer methods can be an effective force inside your app. In particular now I use a subclass in my app RedFlag (a Del.icio.us client in development) called RFNetworkThread all I have to assume is that a credential has been stored in the users keychain beforehand and I dispatch these threads with different URL's with call outs to del.icio.us and they can each store and retrieve the data for a different particular query easily. NSOperation is no different in how it's implemented in relation to subclassed NSThread objects with the exception of specifying that you are subclassing from NSOperation instead:

@interface MyThread : NSOperation {
    NSString *aString;
}
@property(copy) NSString *aString;
@end

@implementation MyThread

@synthesize aString;

- (void)main
{
    self.aString = @"Super Big String that needed a thread";
}

@end

However with NSOperation you gain some big benefits, most notably among them that you can use NSOperation with NSOperationQueue to have some sort of flow of control, NSOperation is also KVO compliant so that you can receive notifications about when it is executing and done executing. It's bindable properties are:

isCancelled
isConcurrent
isExecuting
isFinished
isReady
the Operations dependencies array
quePriority (only writable property)

Apple recommends that if you override any of these priorities that you maintain KVO compliance in your Operation objects, additionally if you extend any further properties on your operation objects you should make them KVO compliant as well. If you want to observe when your operation objects have completed execution, you could do it in a manner like this:

    1 NSOperationQueue *que = [[NSOperationQueue alloc] init];
    2     
    3 loginThread = [[RFNetworkThread alloc] init];
    4     
    5 [loginThread addObserver:self
    6               forKeyPath:@"isFinished" 
    7                options:0
    8                   context:nil];
    9     
   10 [que addOperation:loginThread];

Heres what we did (#1 Line 1) Created a NSOperationQueue to spawn our network thread on. (2: Line 3) Created our NSOperation Subclass object (3: Line 5) add the controller class as an observer of the operation objects keyPath "isFinished." Because isFinished will always be NO till it actually completes and changes to YES we just only need to be notified to when that keyPath is changed and then (4: Line 10) Add the operation object to the Queue. It's important to note that if you want to modify anything about a particular NSOperation object you need to do it before placing it onto the queue, otherwise it's just like playing russian roulette... you just don't know what will happen. Now to observe the change in the operation object you only need to implement standard KVO notification in whatever class added itself as an observer to the operation object like so:

    1 - (void)observeValueForKeyPath:(NSString *)keyPath 
    2                            ofObject:(id)object 
    3                               change:(NSDictionary *)change 
    4                             context:(void *)context
    5 {
    6 if([keyPath isEqual:@"isFinished"] && loginThread == object){
    7         NSLog(@"Our Thread Finished!");
    8         
    9         [loginThread removeObserver:self
   10                          forKeyPath:@"isFinished"];
   11                         
   12         [self processDeliciousData:loginThread.returnData];
   13     } else {
   14         [super observeValueForKeyPath:keyPath
   15                              ofObject:object 
   16                                change:change
   17                               context:context];
   18     }
   19 }
   20

The method observeValueForKeyPath: ofObject: change: context: allows us to be notified through KVO when our object has completed. In this instance I am only concerned about making sure that the path is "isFinished" and that the loginThread I created earlier is the object in question (line 6), because of that once I've gotten this notification I only need to remove myself as an observer and do what I intended to do with the data I got back from the operation. Beyond that on line 14 I only need make sure that if the operation hasn't finished and it's not our object that I just pass on the KVO notification up the chain of interested objects. The other alternative is to do like Marcus Zarra has demoed on Cocoa Is My Girlfriend and create a shared instance, so as to provide a reference that you can call performSelectorOnMainThread on and return control back to the main thread. Neither option is the wrong one, it just depends on which method you like better and how many operation objects you are tracking. What's better in my opinion is when you create your own dependency tree and use KVO notifications to be notified when the root dependency has been completed. This brings me to my next point NSOperation dependencies. NSThread has no built in mechanism for adding dependencies as of right now. However NSOperation has the - (void)addDependency:(NSOperation *)operation method which allows for an easy mechanism (when used with NSOperationQueue) for dependency management. So with that lets get into NSOperationQueue... NSOperationQueue Unless you intend to manage NSOperation objects yourself, NSOperationQueue will be your path to managing threads and has several API's for dealing with how much concurrency you can tolerate in your app, blocking a thread until all operation objects have finished, etc. When you place NSOperation objects onto the queue it will go through your objects and check for ones that have no dependencies or have dependencies that have already completed. In this way you can build your own custom dependency tree and toss all the objects onto the queue and NSOperationQueue will follow and obey how much concurrency you can handle. As noted earlier you only need to use - (void)addDependency:(NSOperation *)operation to indicate this to the operation queue. By default NSOperationQueue will spawn off as many threads as your system has cores, however if you need more or less or would just like to be able to specify that in your application programatically you can use the - (void)setMaxConcurrentOperationCount:(NSInteger)count and set concurrency count yourself, although Apple reccommends that you use the value NSOperationQueueDefaultMaxConcurrentOperationCount which automatically adjusts the number of operation objects that execute dynamically as your system load levels go up and down. Another useful API is the ability to put a bunch of Operation objects onto the que and wait for them to finish halting the method you are in while you wait for this to happen. The - (void)waitUntilAllOperationsAreFinished method was designed for this. I would recommend that you perform this on a 2nd+ thread so as to not interrupt the main thread while you wait for operation objects to complete. Since NSOperation is KVO compliant, it only make sense that NSOperationQueue be KVO compliant as well. And the most useful binding NSOperationQueue has is the ability to bind to all the NSOperation objects being executed with its operations bind-able property. This is somewhat analogous to Mails Activity viewer ( Command + 0 ) interface. This enables you to provide information to your users about what is going on in your application if you are providing such an interface. Personally I would recommend that you do provide such an interface, but expose it as an optional interface in the same way that mail doesn't show the activity viewer by default but allows you to hit a shortcut and bring up the activity viewer.

Conclusion & References

I've touched over a lot things in this article, but even at this I am barely scratching the surface of multithreading. It's a very expansive topic indeed. But I hope I gave you a starting point here. In this article i've covered:

What threads are
When you should use threading
When you shouldn't use threading
What the Costs of Creating Threads are
A warning about Threading and it's implications
How Threads are implemented on Mac OS X
Mach Threads
POSIX Threads
Thread Locks
NSThread
NSOperation
NSOperationQueue

And yet so much still left to talk about. Below you'll find a simple concise list of links of API's and Classes mentioned throughout the article. NSRunLoop Technical Note TN2028: Threading Architectures (About Mach Threads) Thread Costs Thread Safety Thread Safe and Unsafe Classes Apples Threading Programming Guide Mac OS X Internals (Book) POSIX Threads (Wikipedia) POSIX Thread Header File ( pthread.h ) NSObject Threading Method NSThread Class Reference NSStringFromSelector Method NSLock NSRecursiveLock NSConditionLock POSIX Thread Lock Google Mac Blog on @synchronized performance slowdown Chris Kane @ Apple on KVO Thread Safety and KVO updates NSOperation NSOperationQueue Marcus Zarra Tutorial on NSOperation with shared instance reference

Credits

Thread memory layout graph based off of graph from CocoaDevCentral licensed under a Creative Commons License @ http://cocoadevcentral.com/articles/000061.php by H. Lally Singh

Cocoa Samurai