
iOS Daemon Journey

A journey to investigate iOS daemon memory issue


Notice: this article is based on jailbroken iOS devices, up to iOS 9.0.3.

Context

Recently I worked on a small project that plays with an iOS daemon.

The requirement is that we run an open source S3 server, and the iOS daemon uploads files to it via the AWS S3 iOS SDK.

The coding part is straightforward, but during testing we found one serious issue: the daemon keeps crashing even when uploading 9MB files. There is no crash report generated; only from the syslog can we see that the daemon crashed, followed by a message from jetsam.

Then I hooked up a debug server and used lldb to debug my daemon. When I manually entered code to execute the upload function, lldb warned me a few seconds later with:

terminated due to memory issue

I was totally confused, but my guess was that it had something to do with how the AWS SDK reads the files.

How the AWS iOS SDK uploads files

The AWS iOS SDK handles uploads via AWSS3TransferManager and AWSS3TransferManagerUploadRequest.

It also has a constant called AWSS3TransferManagerMinimumPartSize that controls the minimum part size.

if (fileSize > AWSS3TransferManagerMinimumPartSize) {
    return [weakSelf multipartUpload:uploadRequest fileSize:fileSize cacheKey:cacheKey];
} else {
    return [weakSelf putObject:uploadRequest fileSize:fileSize cacheKey:cacheKey];
}

If the file is larger than the minimum part size, it will instead invoke:

- (AWSTask *)multipartUpload:(AWSS3TransferManagerUploadRequest *)uploadRequest fileSize:(unsigned long long)fileSize cacheKey:(NSString *)cacheKey

Inside multipartUpload:, it cuts the file into small pieces and re-saves them as files under:

[NSURL fileURLWithPath:[NSTemporaryDirectory() stringByAppendingPathComponent:[[NSUUID UUID] UUIDString]]]

It uses NSFileHandle to split the file. It turned out that this is what potentially causes the memory issue, though I have no strong proof.

I added some logging around the NSFileHandle operations and printed it to syslog, and I always saw that after calling NSData *partData = [fileHandle readDataOfLength:dataLength] or [fileHandle closeFile], the daemon crashed and restarted.
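To make the crash window concrete, here is a simplified reconstruction of what the splitting loop inside multipartUpload: roughly does. This is my own sketch, not the SDK's actual code; filePath, fileSize, and partSize are placeholders.

// Simplified reconstruction (not the SDK's actual code) of how multipartUpload:
// slices the source file with NSFileHandle. filePath, fileSize and partSize
// are hypothetical placeholders.
NSFileHandle *fileHandle = [NSFileHandle fileHandleForReadingAtPath:filePath];
unsigned long long offset = 0;
NSUInteger partNumber = 1;
while (offset < fileSize) {
    NSUInteger dataLength = (NSUInteger)MIN((unsigned long long)partSize, fileSize - offset);
    [fileHandle seekToFileOffset:offset];
    // Each part is read fully into memory here -- right where my syslog trace
    // showed the daemon dying.
    NSData *partData = [fileHandle readDataOfLength:dataLength];
    NSURL *partURL = [NSURL fileURLWithPath:
        [NSTemporaryDirectory() stringByAppendingPathComponent:[[NSUUID UUID] UUIDString]]];
    [partData writeToURL:partURL atomically:YES];
    NSLog(@"wrote part %lu (%lu bytes) to %@", (unsigned long)partNumber, (unsigned long)dataLength, partURL);
    offset += dataLength;
    partNumber++;
}
[fileHandle closeFile];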

Tweaking AWSS3TransferManagerMinimumPartSize

Since we saw AWSS3TransferManagerMinimumPartSize, I first decided to reduce the value to 2MB, and everything seemed fine. I then changed it to 3MB; at first it worked fine, but after uploading several parts the daemon crashed and restarted again, looping the upload function over and over. One important thing I found is that it does not crash at the same iteration: for instance, sometimes it can upload 20 parts, but after the crash and restart it can only upload 17 parts. This gave me the hint that the daemon was being killed on purpose, most likely by the iOS kernel (thanks to the Linux OOM killer knowledge I picked up when I was at EMC).

So I figured the daemon must have some sort of memory limit set by the kernel, but how do we change it, if that is even possible?

AFNetworking is truly amazing

Before I used S3 for uploads, I had tested file upload using AFNetworking. I remember it could upload binaries of 20+ MB, also using multipart upload, which confused me. So I wrote up a small Django server to receive files through the AFNetworking interface:

NSURLSessionConfiguration *configuration = [NSURLSessionConfiguration defaultSessionConfiguration];
AFHTTPSessionManager *manager = [[AFHTTPSessionManager alloc] initWithSessionConfiguration:configuration];
NSURLSessionDataTask *dataTask = [manager POST:serverURL parameters:nil constructingBodyWithBlock:^(id<AFMultipartFormData>  _Nonnull formData) {
    [formData appendPartWithFileURL:fileURL name:uploadName error:nil];
} success:^(NSURLSessionDataTask * _Nonnull task, id  _Nonnull responseObject) {
} failure:^(NSURLSessionDataTask * _Nonnull task, NSError * _Nonnull error) {
}];
[dataTask resume];

The result was astonishing: a 200MB file was uploaded to my Django server. This puzzled me even more: how can one work with 200MB while the other one fails at 20MB?

S3 upload in app

I decided to write a simple app to see how the S3 SDK behaves in a normal app. After a few lines of code, I found that the app works just fine. This gives another hint: daemons and apps are treated very differently.
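For reference, the test app's upload path looked roughly like the sketch below. It is a minimal sketch against the AWS iOS SDK 2.x transfer manager API: the service configuration (credentials and our custom S3 endpoint) is set up elsewhere and omitted, and bucketName, objectKey, and fileURL are placeholders.

// Minimal sketch of an upload via AWSS3TransferManager (AWS iOS SDK 2.x).
// Assumes the default service configuration (credentials/endpoint) is already
// in place; bucketName, objectKey and fileURL are hypothetical placeholders.
AWSS3TransferManagerUploadRequest *uploadRequest = [AWSS3TransferManagerUploadRequest new];
uploadRequest.bucket = bucketName;
uploadRequest.key = objectKey;
uploadRequest.body = fileURL;  // NSURL pointing at the local file

AWSS3TransferManager *transferManager = [AWSS3TransferManager defaultS3TransferManager];
[[transferManager upload:uploadRequest] continueWithBlock:^id(AWSTask *task) {
    if (task.error) {
        NSLog(@"S3 upload failed: %@", task.error);
    } else {
        NSLog(@"S3 upload finished");
    }
    return nil;
}];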

Some theories so far

Given the facts above, we have some theories:

  1. The daemon is not as restricted as we earlier thought. Since AFNetworking can do the job, the S3 SDK should too, theoretically.
  2. NSFileHandle may have a leak and trigger the kernel to kill the daemon (though we have not tested this in an app).
  3. The AWS S3 iOS SDK could have a serious memory issue in its heavy use of continueWithSuccessBlock, not properly handling and freeing the large NSData objects in each multipart request. I personally think the evil is in the recursive blocks: when you use them correctly they are very powerful, but when you make a mistake it is horrible to track down. The AWS devs may have only tested in an app and been happy to see all the unit tests pass, never hitting the situation I encountered.

It's not likely that Apple has a leak in NSFileHandle; it's merely a file descriptor wrapper according to the documentation. So we first rule out #2 and leave it behind. As for #3, even if it really has bugs, I'm not able to fix them for the obvious reason: no time to find out where they hold on to objects they should release. It could also mean there are no bugs at all, just that an app process has more tolerance than a daemon.

Our backend engineer suggested we abandon S3; instead, we could have a small server receive the files and put them into S3 later. For a while I had the same thought: give up investigating and blame AWS for its awful SDK. But I decided to give theory #1 one more try.

Save the day

Actually, I had already started working on theory #1 when I found those interesting facts about the AWS SDK. I figured that if I could find out how to increase the daemon memory limit on iOS, the problem would be gone. But how? As we all know, iOS is not open, and app developers never even get the chance to think about running a daemon on iOS. Beyond the SDKs, frameworks, and general OS knowledge and concepts we work with daily, a normal app developer actually knows nothing about iOS internals.

But thanks to the jailbreak community, we are able to take a deep peek inside iOS.

Revisit an old friend and pioneer

When I was working at EMC, I took a one-week training on Linux kernel debugging. The lecturer was Jonathan Levin, who is a pioneer and a true master in my opinion. He knows almost everything, literally, about OS kernels across Windows, Linux, OS X (later macOS), and iOS. During his training I learned that he wrote a book called "OS X and iOS Internals". At that time I was obsessed with Linux kernels, knew OS X had Unix blood, and was very interested in Apple's implementation, so I bought the book and read half of it. Fortunately, he keeps digging into iOS internals and published the article [Handling low memory conditions in iOS and Mavericks].

From the article I learned some facts:

  1. Daemons and other services do have memory limits set by the iOS kernel, and jetsam is the killer, which explains what we saw in the syslog in the beginning.
  2. /System/Library/LaunchDaemons/com.apple.jetsamproperties.{MODEL}.plist, where {MODEL} can be N51 (iPhone 5s), J71 (iPad Air), etc., sets the priority, memory limit, and other values.
  3. memorystatus_control is a system call that is not documented, but can be found in kern_memorystatus.h:

    "Introduced somewhere around xnu 2107 (that is, as early as iOS 6 but not until OS X 10.9), this (undocumented) syscall enables you to control both memory status and jetsam (the latter, on iOS)."

OK, now we have a wild guess about how to make this work; the syscall's userspace prototype is sketched right after this list. But first we need to know: is our daemon really restricted by jetsam?
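For orientation, here is the syscall's userspace prototype as declared in bsd/sys/kern_memorystatus.h in the XNU sources (a sketch; the numeric values of the command constants live in the same header).

#include <stdint.h>
#include <stddef.h>

// Userspace prototype of the undocumented memorystatus_control() syscall, as
// declared in bsd/sys/kern_memorystatus.h. The two commands used in this
// article are MEMORYSTATUS_CMD_GET_PRIORITY_LIST (dump the priority table)
// and MEMORYSTATUS_CMD_SET_MEMLIMIT_PROPERTIES (change a process's limit).
int memorystatus_control(uint32_t command, int32_t pid, uint32_t flags,
                         void *buffer, size_t buffersize);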

Getting the memory status

Luckily, I found that Jonathan had already written a program called [mlist.c]. It prints out a memorystatus_priority_entry for each process on macOS/iOS. However, it's kind of old and targets iOS 6, while we are on iOS 9 now. So what do we do? Remember that this sort of knowledge can be found on opensource.apple.com, and the first thing we need to do is check which XNU version iOS 9 uses. Calling uname -a on the iPhone shows:

Darwin Kernel Version 15.0.0: Thu Aug 20 13:11:14 PDT 2015; root:xnu-3248.1.31/RELEASE_ARM64_S5L8960X iPhone6,1 arm64 N51AP Darwin

It's xnu-3248.1.31! Let's see what we have on [XNU]: the closest version is xnu-3248.20.55; though it's somewhat newer (1.31 vs 20.55), the interfaces should not have changed much. Diving into [xnu-3248.20.55/bsd/sys/kern_memorystatus.h] we found:

typedef struct memorystatus_priority_entry {
	pid_t pid;
	int32_t priority;
	uint64_t user_data;
	int32_t limit;
	uint32_t state;
} memorystatus_priority_entry_t;

It seems memorystatus_priority_entry should work on iOS 9. Now let's get the job done (just borrow code from mlist.c, and don't forget to copy kern_memorystatus.h along with it):

mtool.c
 
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include "kern_memorystatus.h"
 
#define NUM_ENTRIES 1024
 
char *state_to_text(int State)
{
    // Convert kMemoryStatus constants to a textual representation
    static char returned[80];
 
    sprintf (returned, "0x%02x ",State);
 
    if (State & kMemorystatusSuspended) strcat(returned,"Suspended,");
    if (State & kMemorystatusFrozen) strcat(returned,"Frozen,");
    if (State & kMemorystatusWasThawed) strcat(returned,"WasThawed,");
    if (State & kMemorystatusTracked) strcat(returned,"Tracked,");
    if (State & kMemorystatusSupportsIdleExit) strcat(returned,"IdleExit,");
    if (State & kMemorystatusDirty) strcat(returned,"Dirty,");
 
    if (returned[strlen(returned) -1] == ',')
        returned[strlen(returned) -1] = '\0';
 
    return (returned);
}
 
int main (int argc, char **argv)
{
    struct memorystatus_priority_entry memstatus[NUM_ENTRIES];
    size_t  count = sizeof(struct memorystatus_priority_entry) * NUM_ENTRIES;
 
    // call memorystatus_control
    int rc = memorystatus_control (MEMORYSTATUS_CMD_GET_PRIORITY_LIST,    // 1 - only supported command on OS X
                                   0,    // pid
                                   0,    // flags
                                   memstatus, // buffer
                                   count); // buffersize
 
    if (rc < 0) { perror ("memorystatus_control"); exit(rc);}
 
    int entry = 0;
    for (; rc > 0; rc -= sizeof(struct memorystatus_priority_entry))
    {
        printf ("PID: %5d\tPriority:%2d\tUser Data: %llx\tLimit:%2d\tState:%s\n",
                memstatus[entry].pid,
                memstatus[entry].priority,
                memstatus[entry].user_data,
                memstatus[entry].limit,
                state_to_text(memstatus[entry].state));
        entry++;   
    }
}

Note: this program works on macOS directly, but in order to run it on iOS, we need to build it as a tool tweak.

Now we find our daemon process and print out its memory status priority entry:

iOS-06:~ root# mtool |grep 32167
PID: 32167	Priority: 0	User Data: 0	Limit: 6	State:0x18 Tracked,IdleExit

Wait, what? Priority 0, and the Limit is only 6 (MB)? You are a clear shot target for jetsam! That also reminds me that when I changed AWSS3TransferManagerMinimumPartSize, whether to 2MB or 1MB, the daemon kept getting killed, so we know the AWS S3 SDK indeed does not release resources while uploading. Time to fix this, AWS devs! Look at your neighbor AFNetworking :) How about our S3 uploader app?

iOS-06:~ root# ps -ax|grep S3
32604 ??         0:00.58 /var/mobile/Containers/Bundle/Application/FF27DD3B-099E-4047-A31A-826868050209/S3Uploader.app/S3Uploader
32606 ttys002    0:00.01 grep S3
iOS-06:~ root# mtool |grep 32604
PID: 32604	Priority:10	User Data: 10100	Limit:650	State:0x00 

I think we have the answer: iOS apps get a Limit of 650, and the AWS S3 SDK can live in that happy land.

Giving more juice

The final step is changing the daemon memory limit, e.g. to 650. Thanks to Apple, MEMORYSTATUS_CMD_SET_MEMLIMIT_PROPERTIES is exported on iOS 9. The code is simple:

#include "kern_memorystatus.h"
int pid = (int)getpid();
// call memorystatus_control
memorystatus_memlimit_properties_t memlimit;
memlimit.memlimit_active = 650;
memlimit.memlimit_inactive = 650;
DDLog(@"setting memory limit for MIWorker pid:%d", pid);
int rc = memorystatus_control (MEMORYSTATUS_CMD_SET_MEMLIMIT_PROPERTIES,
                               pid,  // pid
                               0,  // flags
                               &memlimit,  // buffer
                               sizeof(memlimit));  // buffersize
 
DDLog(@"DONE: setting memory limit for MIWorker pid:%d, rc:%d", pid, rc);

And check out again:

iOS-06:~ root# mtool |grep 32900
PID: 32900	Priority: 0	User Data: 0	Limit:650	State:0x18 Tracked,IdleExit

It works! Then I tried uploading a 215MB file: success! It works like a charm.

Conclusion but not the end

Though the S3 upload code is very simple and straightforward, we ran into major issues with iOS jetsam. Luckily, Jonathan saved my day by pointing me in the right direction. I had a lot of pain and fun playing with C and Objective-C code, and it reminded me of the old days at EMC dancing with the kernel.

Future work

Change the plist to set the memory limit for daemons.

For further study, here's the implementation of [kern_memorystatus.c] (XNU 3248.20.55).