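A while ago I was looking for something to do and tried to see what I could build with the Douban API. That is how I ran into this problem: XML parsing.
My teacher had not covered it yet, so I had to look it up myself.
There are two main models for parsing XML documents, SAX and DOM. Both can be used on iOS, so I won't go into detail about them here. I chose SAX.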
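On iOS, XML is parsed with the built-in NSXML classes. The core is the NSXMLParser class and its delegate protocol, NSXMLParserDelegate, and most of the actual parsing work is done in the class that implements NSXMLParserDelegate. The protocol defines many callback methods. As the SAX parser walks the XML document from top to bottom, it fires them when it meets a start tag, an end tag, the start of the document, the end of the document, or character data. There are quite a few of these methods; below are the five most commonly used.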
Fired when the document starts:
-(void)parserDidStartDocument:(NSXMLParser *)parser
Fired when a new start tag is encountered. namespaceURI is the namespace, qualifiedName is the qualified name, and attributes is a dictionary holding the tag's attributes:
-(void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
Fired when character data is found:
-(void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
Fired when an end tag is encountered:
-(void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
Fired when the document ends:
-(void)parserDidEndDocument:(NSXMLParser *)parser
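Before the full example, here is a minimal sketch of how these callbacks get wired up, and of one thing that is easy to overlook: parse returns a BOOL, and parserError tells you what went wrong when it returns NO. The EventCounter class and the CountEvents function are names I made up for illustration; the code assumes ARC and an XML string held in memory.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that only counts how often each callback fires.
@interface EventCounter : NSObject <NSXMLParserDelegate>
@property (nonatomic) NSInteger startCount;
@property (nonatomic) NSInteger endCount;
@property (nonatomic) BOOL sawDocumentStart;
@property (nonatomic) BOOL sawDocumentEnd;
@end

@implementation EventCounter
- (void)parserDidStartDocument:(NSXMLParser *)parser {
    self.sawDocumentStart = YES;
}
- (void)parserDidEndDocument:(NSXMLParser *)parser {
    self.sawDocumentEnd = YES;
}
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    self.startCount += 1;
}
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    self.endCount += 1;
}
@end

// Parses an XML string and returns the counter, or nil if parsing failed.
static EventCounter *CountEvents(NSString *xml) {
    NSData *data = [xml dataUsingEncoding:NSUTF8StringEncoding];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    EventCounter *counter = [[EventCounter alloc] init];
    parser.delegate = counter;  // the parser does not retain its delegate
    if (![parser parse]) {      // parse runs synchronously and returns NO on error
        NSLog(@"parse failed: %@", parser.parserError);
        return nil;
    }
    return counter;
}
```

Calling CountEvents with a small well-formed string should give equal start and end counts, since every element is opened once and closed once. A string with mismatched tags should come back as nil.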
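Let's walk through a concrete example to see the whole call sequence and parsing process.
First, here is the XML file we are going to parse, "info.xml":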
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <person id="1">
        <firstName>Wythe</firstName>
        <lastName>xu</lastName>
        <age>22</age>
    </person>
    <person id="2">
        <firstName>li</firstName>
        <lastName>si</lastName>
        <age>31</age>
    </person>
    <person id="3">
        <firstName>Dipen</firstName>
        <lastName>Shah</lastName>
        <age>24</age>
    </person>
</root>
Next comes the header file, "ViewController.h":
#import <UIKit/UIKit.h>

@interface ViewController : UIViewController<NSXMLParserDelegate>

@property NSXMLParser *parser;
@property NSMutableArray *person;
@property NSString *currenttag;

@end
And then its implementation file, "ViewController.m":
#import "ViewController.h"

@interface ViewController ()

@end

@implementation ViewController

@synthesize parser = _parser, person = _person, currenttag = _currenttag;

- (id)initWithNibName:(NSString *)nibNameOrNil bundle:(NSBundle *)nibBundleOrNil
{
    self = [super initWithNibName:nibNameOrNil bundle:nibBundleOrNil];
    if (self) {
        // Custom initialization
    }
    return self;
}

- (void)viewDidLoad
{
    [super viewDidLoad];
    // Load info.xml from the app bundle and hand it to the parser.
    NSString *xmlFilePath = [[NSBundle mainBundle] pathForResource:@"info" ofType:@"xml"];
    NSData *data = [[NSData alloc] initWithContentsOfFile:xmlFilePath];
    self.parser = [[NSXMLParser alloc] initWithData:data];
    self.parser.delegate = self;
    // parse runs synchronously, so _person is filled in by the time it returns.
    [self.parser parse];
    NSLog(@"%@", _person);
}

- (void)didReceiveMemoryWarning
{
    [super didReceiveMemoryWarning];
    // Dispose of any resources that can be recreated.
}

#pragma mark delegate method

- (void)parserDidStartDocument:(NSXMLParser *)parser
{
    // Start with an empty list of people for every document.
    _person = [[NSMutableArray alloc] init];
    NSLog(@"start parse 1");
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    // Remember which tag we are inside so foundCharacters knows where to store text.
    _currenttag = elementName;
    if ([_currenttag isEqualToString:@"person"]) {
        NSString *_id = [attributeDict objectForKey:@"id"];
        NSMutableDictionary *dict = [[NSMutableDictionary alloc] init];
        if (_id) {
            // setObject:forKey: raises an exception on nil, so guard against a missing id.
            [dict setObject:_id forKey:@"id"];
        }
        [_person addObject:dict];
    }
    NSLog(@"start element");
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // The most recently added dictionary belongs to the person being parsed.
    // Note: the parser may report one element's text in several pieces;
    // this simple version keeps only the most recent piece.
    NSMutableDictionary *dict = [_person lastObject];
    if ([_currenttag isEqualToString:@"firstName"] && dict) {
        [dict setObject:string forKey:@"firstName"];
    }
    if ([_currenttag isEqualToString:@"lastName"] && dict) {
        [dict setObject:string forKey:@"lastName"];
    }
    if ([_currenttag isEqualToString:@"age"] && dict) {
        [dict setObject:string forKey:@"age"];
    }
    NSLog(@"found characters");
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    // Clear the tag so whitespace between elements is not stored anywhere.
    _currenttag = nil;
    NSLog(@"end element");
}

- (void)parserDidEndDocument:(NSXMLParser *)parser
{
    NSLog(@"parse end");
}

@end
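Using breakpoints and the log output, we can see that the overall parsing sequence is: document start, start tag, characters found, end tag, document end.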
Execution result:
2014-09-10 16:45:32.920 xmlforblog[3820:60b] start parse 1
2014-09-10 16:45:32.921 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.922 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.922 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.922 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.922 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.923 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.923 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.923 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.923 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.924 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.924 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.924 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.924 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.925 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.925 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.925 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.925 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.926 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.926 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.928 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.929 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.929 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.929 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.930 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.930 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.930 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.930 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.931 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.931 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.931 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.931 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.931 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.932 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.932 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.932 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.932 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.933 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.933 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.933 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.933 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.934 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.934 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.934 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.934 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.935 xmlforblog[3820:60b] start element
2014-09-10 16:45:32.935 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.935 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.935 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.936 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.936 xmlforblog[3820:60b] found characters
2014-09-10 16:45:32.936 xmlforblog[3820:60b] end element
2014-09-10 16:45:32.936 xmlforblog[3820:60b] parse end
2014-09-10 16:45:32.936 xmlforblog[3820:60b] (
    {
        age = 22;
        firstName = Wythe;
        id = 1;
        lastName = xu;
    },
    {
        age = 31;
        firstName = li;
        id = 2;
        lastName = si;
    },
    {
        age = 24;
        firstName = Dipen;
        id = 3;
        lastName = Shah;
    }
)
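One thing in the log puzzled me at first: "found characters" also shows up right after "end element", where there is no text. Those calls carry the line breaks and indentation between tags. Here is a small sketch that makes them visible by recording every string passed to foundCharacters. TextRecorder, IsBlank and RecordText are illustrative names of my own, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that records every string handed to foundCharacters.
@interface TextRecorder : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *chunks;
@end

@implementation TextRecorder
- (instancetype)init {
    if ((self = [super init])) {
        _chunks = [NSMutableArray array];
    }
    return self;
}
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.chunks addObject:string];
}
@end

// YES if the string is empty or holds only spaces, tabs and line breaks.
static BOOL IsBlank(NSString *s) {
    NSCharacterSet *ws = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    return [s stringByTrimmingCharactersInSet:ws].length == 0;
}

// Parses the XML string and returns the recorded chunks, or nil on a parse error.
static NSArray *RecordText(NSString *xml) {
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    TextRecorder *recorder = [[TextRecorder alloc] init];
    parser.delegate = recorder;
    if (![parser parse]) {
        return nil;
    }
    return recorder.chunks;
}
```

Running RecordText on an indented document and passing each chunk through IsBlank shows which callbacks carried real text and which carried only layout whitespace. This is why the example above resets _currenttag to nil in didEndElement: without that, the whitespace would overwrite the values just stored.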
Our own processing happens mainly in the start-tag and found-characters callbacks:
-(void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
-(void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
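When we encounter a start tag, we first check its name. If it is person, that person's information comes next, so we create a mutable dictionary up front to hold the values.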
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    // Remember which tag we are inside so foundCharacters knows where to store text.
    _currenttag = elementName;
    if ([_currenttag isEqualToString:@"person"]) {
        NSString *_id = [attributeDict objectForKey:@"id"];
        NSMutableDictionary *dict = [[NSMutableDictionary alloc] init];
        if (_id) {
            // setObject:forKey: raises an exception on nil, so guard against a missing id.
            [dict setObject:_id forKey:@"id"];
        }
        [_person addObject:dict];
    }
    NSLog(@"start element");
}
When character data is found, we check the current tag name and save the text under the matching key in the dictionary we just created.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // The most recently added dictionary belongs to the person being parsed.
    // Note: the parser may report one element's text in several pieces;
    // this simple version keeps only the most recent piece.
    NSMutableDictionary *dict = [_person lastObject];
    if ([_currenttag isEqualToString:@"firstName"] && dict) {
        [dict setObject:string forKey:@"firstName"];
    }
    if ([_currenttag isEqualToString:@"lastName"] && dict) {
        [dict setObject:string forKey:@"lastName"];
    }
    if ([_currenttag isEqualToString:@"age"] && dict) {
        [dict setObject:string forKey:@"age"];
    }
    NSLog(@"found characters");
}
By repeating this process over and over, we end up parsing the whole XML document.
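One caveat about storing the text directly in foundCharacters: Apple's documentation says the parser may deliver the text of a single element in several calls, so the string you receive can be only part of the content. A more defensive variant, sketched below, appends each piece to a buffer and writes the value when the end tag arrives. This is not what my example above does. PersonCollector and CollectPeople are hypothetical names, trimming the surrounding whitespace is my own choice, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate: buffers text per element and stores it when the element ends.
@interface PersonCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *people;
@property (nonatomic, strong) NSMutableString *buffer;
@end

@implementation PersonCollector
- (void)parserDidStartDocument:(NSXMLParser *)parser {
    self.people = [NSMutableArray array];
}
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if ([elementName isEqualToString:@"person"]) {
        NSMutableDictionary *dict = [NSMutableDictionary dictionary];
        NSString *personID = attributeDict[@"id"];
        if (personID) {
            dict[@"id"] = personID;
        }
        [self.people addObject:dict];
    }
    // Start a fresh buffer for whatever text this element contains.
    self.buffer = [NSMutableString string];
}
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // May be called several times per element, so append instead of overwrite.
    // While buffer is nil (between elements) this message is a no-op.
    [self.buffer appendString:string];
}
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    NSArray *fields = @[@"firstName", @"lastName", @"age"];
    NSMutableDictionary *dict = [self.people lastObject];
    if (dict && self.buffer && [fields containsObject:elementName]) {
        NSCharacterSet *ws = [NSCharacterSet whitespaceAndNewlineCharacterSet];
        dict[elementName] = [self.buffer stringByTrimmingCharactersInSet:ws];
    }
    self.buffer = nil;  // ignore whitespace until the next start tag
}
@end

// Parses the XML string and returns an array of dictionaries, or nil on error.
static NSArray *CollectPeople(NSString *xml) {
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    PersonCollector *collector = [[PersonCollector alloc] init];
    parser.delegate = collector;
    return [parser parse] ? collector.people : nil;
}
```

Passing the contents of info.xml to CollectPeople should give the same three dictionaries as before. The difference only matters when an element's text arrives in more than one piece.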
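One more thing: this only covers ordinary documents. If, like I once did, you learn this and go straight to parsing the XML returned by the Douban API, you will find it does not work. That is because many sites carry a lot of data and use namespaces to avoid tag name clashes, and parsing an XML document that uses namespaces is slightly different from what is shown here.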
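To give a rough idea of that difference, here is a sketch built around NSXMLParser's shouldProcessNamespaces property. As far as I can tell from the documentation, it is off by default, in which case elementName carries the prefixed name (for example db:attribute). When it is on, elementName is the local name and namespaceURI is filled in. The feed snippet, the http://example.com/db URI, and the NameRecorder and RecordNames names are all made up for illustration and are not the real Douban format.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that records what the parser reports for each start tag.
@interface NameRecorder : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *names;       // elementName values
@property (nonatomic, strong) NSMutableArray *namespaces;  // namespaceURI values ("" if none)
@end

@implementation NameRecorder
- (instancetype)init {
    if ((self = [super init])) {
        _names = [NSMutableArray array];
        _namespaces = [NSMutableArray array];
    }
    return self;
}
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    [self.names addObject:elementName];
    // namespaceURI can be nil, and arrays cannot hold nil, so store "" instead.
    [self.namespaces addObject:(namespaceURI ?: @"")];
}
@end

// Parses xml with namespace processing on or off; returns nil on a parse error.
static NameRecorder *RecordNames(NSString *xml, BOOL processNamespaces) {
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.shouldProcessNamespaces = processNamespaces;
    NameRecorder *recorder = [[NameRecorder alloc] init];
    parser.delegate = recorder;
    return [parser parse] ? recorder : nil;
}
```

With namespace processing turned on, the delegate can compare both elementName and namespaceURI, so a check like isEqualToString:@"attribute" matches again, and two tags with the same local name from different namespaces can still be told apart.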