@interface AQBlog : NSBlog @end

Tutorials, musings on programming and ePublishing

Using AQXMLParser and Friends

I've been asked a number of times to provide some examples on how to make use of AQXMLParser. A question which came in today has prompted me to make the effort and actually put together a simple project which demonstrates its use, along with its companion classes HTTPMessage and AQGzipInputStream.

The example project uses the asynchronous XML parser API to fetch my basic Tumblr post list, and displays the types and slugs of those posts in the UI (although it records all the data, in case some enterprising soul wants to make a detail view).

It fetches the data using a fairly simple bit of code. I originally had it using a gzip stream, but it seems Tumblr never gzips its responses, so that didn't work out. I've left the appropriate code in there, albeit commented out, for your edification. In the snippets below, however, I've reinstated the gzip code.

The first thing to do is to create your URL and message:

ParserExampleAppDelegate.m
NSURL * url = [NSURL URLWithString: @"http://..."];
// we're only reading the post list, so this is a GET request
HTTPMessage * msg = [HTTPMessage requestMessageWithMethod: @"GET"
                                                      url: url
                                                  version: HTTPVersion1_1];
msg.useGzipEncoding = YES;

If you want to receive gzipped data (and if the service obeys such requests), setting msg.useGzipEncoding = YES will configure the relevant HTTP request headers for you.
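
For comparison, here's roughly what asking for gzip looks like with plain Foundation classes instead of HTTPMessage. This is a minimal sketch rather than code from the sample project: the function name MakeGzipRequest is made up for illustration, and it only shows the standard Accept-Encoding request header, not everything HTTPMessage may set on your behalf.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the sample project): builds a GET request
// that advertises gzip support via the standard Accept-Encoding header.
static NSURLRequest * MakeGzipRequest(NSURL * url)
{
    NSMutableURLRequest * request = [NSMutableURLRequest requestWithURL: url];
    [request setHTTPMethod: @"GET"];
    [request setValue: @"gzip" forHTTPHeaderField: @"Accept-Encoding"];
    return request;
}
```

The server is free to ignore that header, which is exactly what happened with Tumblr, so it's worth confirming the response really is compressed before wrapping it in a decompressing stream.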

The next step is to get a response stream from that message and potentially wrap it in a gzip stream if you've asked for gzipped data:

ParserExampleAppDelegate.m
// get the stream and wrap it in a gzip decoder
NSInputStream * bareStream = [msg inputStream];
AQGzipInputStream * gzipStream = [[AQGzipInputStream alloc] initWithCompressedStream: bareStream];

The parser itself is then built around the stream we just obtained. In this instance we'll use a subclass, AQXMLParserWithTimeout, which implements a timeout of its own rather than waiting for the underlying protocol to decide it won't wait any further. Note that the parser in this example is a member variable, since we're using an asynchronous API and want to have a clean retain-release-autorelease contract in place with the API.

ParserExampleAppDelegate.m
parser = [[AQXMLParserWithTimeout alloc] initWithStream: gzipStream];
[gzipStream release]; // done with this now
parser.timeout = 30.0;

Next up is the parser delegate, which we'll see in more detail shortly. AQXMLParser is an event-based XML parser, so a delegate is required in order to get anything out of the parsed data. Again, we're storing the parser delegate in a member variable and releasing it when we know we're done with it.

ParserExampleAppDelegate.m
parserDelegate = [[TumblrParserDelegate alloc] init];
// set any parser delegate properties/variables here
parser.delegate = parserDelegate;

If we were on a background thread already, we could run the parser in synchronous mode by simply calling [parser parse]. Since we're working asynchronously on a single thread, however, we will use a somewhat wordier API:

ParserExampleAppDelegate.m
if ( [parser parseAsynchronouslyUsingRunLoop: [NSRunLoop currentRunLoop]
                                        mode: NSDefaultRunLoopMode
                           notifyingDelegate: self
                                    selector: @selector(parser:completedOK:)
                                     context: NULL] == NO )
{
    NSError * error = parser.parserError;
    UIAlertView * alert = [[UIAlertView alloc] initWith...];
    [alert show];
    [alert release];
    [parser release]; parser = nil;
    [parserDelegate release]; parserDelegate = nil;
}

At this point we sit back and wait for our callback to be called. The callback isn't limited to the two-parameter form used above: it can take three parameters, the two already mentioned (parser and success/failure state) plus a context pointer. I'll add a blocks-based version shortly, I think.

When the parser finishes reading the response and handing the data on to the delegate, it will call the specified callback routine with the results. Here's the one from the sample application in its entirety:

ParserExampleAppDelegate.m
- (void) parser: (AQXMLParser *) aParser completedOK: (BOOL) completedOK
{
    [[UIApplication sharedApplication] setNetworkActivityIndicatorVisible: NO];

    if ( !completedOK )
    {
        NSError * error = parser.parserError;
        UIAlertView * alert = [[UIAlertView alloc] initWith...];
        [alert show];
        [alert release];
    }

    // set the root view's title
    navigationController.topViewController.title = [parserDelegate.tumblog valueForKey: @"title"];

    [parser autorelease];
    parser = nil;
    [parserDelegate autorelease];
    parserDelegate = nil;

    if ( backgroundTaskID != UIBackgroundTaskInvalid )
        [[UIApplication sharedApplication] endBackgroundTask: backgroundTaskID];
}

Our next step is to investigate the delegate. This is a subclass of AQXMLParserDelegate, which implements some nice magic for us. It takes the name of an opening or closing XML tag and converts it into a message name, such as startXXXXWithAttributes: or endXXXX. For tag names which contain hyphens, it removes the hyphen and camelCases the name, meaning that <xml-tag> and </xml-tag> will generate startXmlTagWithAttributes: and endXmlTag. The delegate code then checks whether it responds to the message name it's just generated, and if so it will call it.
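
The name mangling described above can be sketched in a few lines. This isn't the code from AQXMLParserDelegate, just an illustration of the rules as stated; the function names are invented for the example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: turns a tag name into the camel-cased core of a
// message name by dropping hyphens and capitalising each piece,
// so "xml-tag" becomes "XmlTag".
static NSString * CamelCasedTagName(NSString * tagName)
{
    NSMutableString * result = [NSMutableString string];
    for ( NSString * part in [tagName componentsSeparatedByString: @"-"] )
    {
        if ( [part length] == 0 )
            continue;   // skip empty pieces left by leading or doubled hyphens

        // upper-case the first character, leave the rest untouched
        [result appendString: [[part substringToIndex: 1] uppercaseString]];
        [result appendString: [part substringFromIndex: 1]];
    }
    return result;
}

// Hypothetical helper: the selector to try when an opening tag is seen.
static SEL StartSelectorForTag(NSString * tagName)
{
    NSString * name = [NSString stringWithFormat: @"start%@WithAttributes:",
                       CamelCasedTagName(tagName)];
    return NSSelectorFromString(name);
}

// Hypothetical helper: the selector to try when a closing tag is seen.
static SEL EndSelectorForTag(NSString * tagName)
{
    NSString * name = [NSString stringWithFormat: @"end%@",
                       CamelCasedTagName(tagName)];
    return NSSelectorFromString(name);
}
```

With these, StartSelectorForTag(@"xml-tag") gives the selector startXmlTagWithAttributes:, and EndSelectorForTag(@"regular-title") gives endRegularTitle. A delegate would then check respondsToSelector: on the result before sending the message, so tags you don't care about cost nothing more than that check.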

Additionally, the delegate accumulates the contents of tags as a string for you, so in your endXXXX method you can access that text using self.characters. It correctly handles CDATA blocks, so you don't have to worry about parsing those out or trimming CDATA tags from the midst of anything.
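
To show the sort of bookkeeping this saves you, here's a stripped-down sketch of text accumulation written against Foundation's NSXMLParser delegate callbacks. It is an illustration only, not how AQXMLParserDelegate is implemented: the TextAccumulator class and its textByTag method are invented for the example, and it deliberately ignores nested text to stay short.

```objc
#import <Foundation/Foundation.h>

// Illustrative stand-in, not AQXMLParserDelegate itself: collects the text of
// each element using NSXMLParser's delegate callbacks. Manual retain/release,
// like the rest of the code in this post.
@interface TextAccumulator : NSObject <NSXMLParserDelegate>
{
    NSMutableString *     characters;   // text seen so far in the current element
    NSMutableDictionary * textByTag;    // finished text, keyed by element name
}
- (NSDictionary *) textByTag;
@end

@implementation TextAccumulator

- (id) init
{
    self = [super init];
    if ( self != nil )
        textByTag = [[NSMutableDictionary alloc] init];
    return self;
}

- (void) dealloc
{
    [characters release];
    [textByTag release];
    [super dealloc];
}

- (NSDictionary *) textByTag
{
    return textByTag;
}

- (void) parser: (NSXMLParser *) parser didStartElement: (NSString *) elementName
   namespaceURI: (NSString *) namespaceURI qualifiedName: (NSString *) qName
     attributes: (NSDictionary *) attributeDict
{
    // each element starts with an empty buffer (in this simplified version
    // a child's text is not merged back into its parent)
    [characters release];
    characters = [[NSMutableString alloc] init];
}

- (void) parser: (NSXMLParser *) parser foundCharacters: (NSString *) string
{
    [characters appendString: string];
}

- (void) parser: (NSXMLParser *) parser foundCDATA: (NSData *) CDATABlock
{
    // CDATA arrives as UTF-8 bytes, without the <![CDATA[ ... ]]> wrapper
    NSString * text = [[NSString alloc] initWithData: CDATABlock
                                            encoding: NSUTF8StringEncoding];
    if ( text != nil )
        [characters appendString: text];
    [text release];
}

- (void) parser: (NSXMLParser *) parser didEndElement: (NSString *) elementName
   namespaceURI: (NSString *) namespaceURI qualifiedName: (NSString *) qName
{
    // this is the point at which an endXXXX method would read the text
    if ( characters != nil )
        [textByTag setObject: [NSString stringWithString: characters]
                      forKey: elementName];
    [characters release];
    characters = nil;
}

@end
```

Ordinary text and CDATA text end up in the same buffer, which is the behaviour you get for free from self.characters.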

Below are a couple of examples from the TumblrParserDelegate class in the example project:

TumblrParserDelegate.m
- (void) startTumblelogWithAttributes: (NSDictionary *) attrs
{
    NSManagedObject * obj = [self tumblogWithName: [attrs objectForKey: @"name"]];
    if ( obj == nil )
        obj = [NSEntityDescription insertNewObjectForEntityForName: @"Tumblelog"
                                            inManagedObjectContext: self.managedObjectContext];

    self.tumblog = obj;
    [self.tumblog setValue: [attrs objectForKey: @"name"]
                    forKey: @"name"];
    [self.tumblog setValue: [attrs objectForKey: @"title"]
                    forKey: @"title"];
    [self.tumblog setValue: [attrs objectForKey: @"timezone"]
                    forKey: @"timezone"];
}

TumblrParserDelegate.m
- (void) endRegularTitle
{
    AssertEntityType(@"Regular");
    [self.currentPost setValue: self.characters
                        forKey: @"title"];
}

That's all there is to it! And of course you get all the benefits of a proper streaming parser, such as parsing data while it's being received on the wire rather than accumulating megabytes of XML data in memory before looking at it. Fast, low-overhead. Nice.

The full code can be found here.