Swift is Apple’s new programming language, said by many to ‘replace’ Objective-C. This is not the case. I’ve spent some time reverse engineering Swift binaries and the runtime, and I’ve found out quite a bit about it. So far, the verdict is this: Swift is Objective-C without messages.

Objects

Believe it or not, Swift objects are actually Objective-C objects. In a Mach-O binary, the __objc_classlist section contains data for each class in the binary. The structure is like so:

struct objc_class {
    uint64_t isa;
    uint64_t superclass;
    uint64_t cache;
    uint64_t vtable;
    uint64_t data;
};

(note: all structures are from 64-bit builds)

Note the data entry. It points to a structure listing the methods, ivars, protocols, etc. of the class. Normally, data is 8-byte aligned, so its low three bits are zero. For Swift classes, however, the least significant bit of data is set to 1.
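
The check described above can be sketched in C. This is only an illustration: the helper names, the flag constant, and the mask are my own assumptions based on the 8-byte alignment mentioned in the text, not values taken from the runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Layout from the text above (64-bit builds). */
struct objc_class {
    uint64_t isa;
    uint64_t superclass;
    uint64_t cache;
    uint64_t vtable;
    uint64_t data;
};

/* Assumed flag: the least significant bit of data marks a Swift class. */
#define SWIFT_CLASS_BIT UINT64_C(1)

/* Assumed mask: clears the three low bits that 8-byte alignment leaves free. */
#define CLASS_DATA_MASK (~UINT64_C(7))

/* True when the low bit of data is set, i.e. the class came from Swift. */
static bool is_swift_class(const struct objc_class *cls) {
    assert(cls != NULL);
    return (cls->data & SWIFT_CLASS_BIT) != 0;
}

/* Address of the class data structure with the flag bits stripped. */
static uint64_t class_data_address(const struct objc_class *cls) {
    assert(cls != NULL);
    return cls->data & CLASS_DATA_MASK;
}
```

A tool walking __objc_classlist would call is_swift_class on each entry and use class_data_address before following the pointer.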

Classes

The actual structure for Swift classes is a bit odd. Swift classes have no Objective-C methods. We’ll get to that later. Variables for Swift classes are stored as ivars. The Swift getter and setter methods actually modify the ivar values. Oddly, ivars for Swift classes have no type encoding. The pointer that is normally supposed to point to the type encoding is NULL. This is presumably due to the fact that the Objective-C runtime is not supposed to deal with Swift variables itself.

Inheritance

Inheritance in Swift works as you would expect. If Square is a subclass of Shape in Swift, the Objective-C class for Square is also a subclass of the Objective-C class for Shape. However, what if a class in Swift doesn’t have a superclass?

e.g.

class Shape { }

In this case, the Shape class would be a subclass of SwiftObject. SwiftObject is a root Objective-C class, similar to NSObject. It has no superclass, meaning the isa points to itself. Its purpose is to use Swift runtime methods for things like allocation and deallocation, instead of the standard Objective-C runtime. For example, - (void)retain does not call objc_retain, but instead calls swift_retain.

Class Methods

As I mentioned earlier, classes for Swift objects have no Objective-C methods. Instead, the methods have been replaced with C++-like functions, mangling and all. This is likely why Swift has been said to be much faster than Objective-C: there is no more need for objc_msgSend to find and call method implementations.

In Objective-C, method implementations are like so:

type method(id self, SEL _cmd, id arg1, id arg2, ...)

Swift methods are very similar, but with a slightly different argument layout. self is passed as the last argument, and there is no selector.

type method(id arg1, id arg2, ..., id self)
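
To make the difference in argument order concrete, here is a small C sketch of both shapes. The counter struct and both functions are hypothetical, and const char * merely stands in for SEL; only the position of self and the presence of the selector reflect the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Objective-C style: self first, then the selector, then the arguments.
   const char * is only a stand-in for SEL in this sketch. */
typedef long (*objc_style_imp)(void *self, const char *_cmd, long arg1);

/* Swift style as described above: arguments first, self last, no selector. */
typedef long (*swift_style_fn)(long arg1, void *self);

/* Hypothetical receiver used only for this illustration. */
struct counter {
    long value;
};

static long counter_add_objc(void *self, const char *_cmd, long amount) {
    (void)_cmd; /* the selector is passed but unused here */
    assert(self != NULL);
    return ((struct counter *)self)->value + amount;
}

static long counter_add_swift(long amount, void *self) {
    assert(self != NULL);
    return ((struct counter *)self)->value + amount;
}
```

When declaring a function pointer for a Swift method from C, self therefore goes at the end of the parameter list.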

vtable

Just like in C++, Swift classes have a vtable which lists the methods in the class. It is located directly after the class data in the binary, and looks something like this:

struct swift_vtable_header {
    uint32_t vtable_size;
    uint32_t unknown_000;
    uint32_t unknown_001;
    uint32_t unknown_002;
    void* nominalTypeDescriptor;
    // vtable pointers
};

From what I can tell, the vtable for a Swift class is only used when the class is visible at compile time. Otherwise, the method is found through its mangled symbol.
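
As a rough sketch of how a tool might reach the method pointers, the helper below assumes that they begin immediately after the header with no padding. The function name is hypothetical, and because the meaning of vtable_size (entry count or byte size) is not established here, the sketch does not rely on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct swift_vtable_header {
    uint32_t vtable_size;
    uint32_t unknown_000;
    uint32_t unknown_001;
    uint32_t unknown_002;
    void *nominalTypeDescriptor;
    /* vtable pointers follow */
};

/* Returns the address of the first method pointer, assumed to start
   directly after the header. */
static void *const *vtable_entries(const struct swift_vtable_header *header) {
    assert(header != NULL);
    return (void *const *)(header + 1);
}
```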

Name Mangling

Swift keeps metadata about functions (and more) in their respective symbols, which is called name mangling. This metadata includes the function’s name (obviously), attributes, module name, argument types, return type, and more. Take this for example:

class Shape{
    func simpleDescription() -> Int {
        return 5
    }
}

The mangled name for the simpleDescription method is _TFC9swifttest5Shape17simpleDescriptionfS0_FT_Si. Here’s the breakdown:

_T – The prefix for all Swift symbols. Everything will start with this.

F – Function.

C – Function of a class. (method)

9swifttest – The module name, with a prefixed length.

5Shape – The class name the function belongs to, again, with a prefixed length.

17simpleDescription – The function name.

f – The function attribute. In this case it’s ‘f’, which is just a normal function. We’ll get to that in a minute.

S0_FT – I’m not exactly sure what this means, but it appears to mark the start of the arguments and return type.

‘_’ – This underscore separates the argument types from the return type. Since the function takes no arguments, it comes directly after S0_FT.

S – This is the beginning of the return type. The ‘S’ stands for Swift; the return type is a Swift builtin type. The next character determines the type.

i – This is the Swift builtin type: a lowercase ‘i’, which stands for Int.
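
The length-prefixed parts of the breakdown above are easy to pull apart mechanically. The following C sketch is a minimal illustration, not a real demangler: the function and struct names are hypothetical, and it only handles the "_TFC", module, class, function, attribute shape shown here, ignoring everything after the attribute character.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Reads one length-prefixed identifier such as "5Shape" at *cursor,
   copies the name into out, and advances *cursor past it. */
static bool read_identifier(const char **cursor, char *out, size_t out_size) {
    const char *p = *cursor;
    size_t length = 0;

    if (!isdigit((unsigned char)*p)) return false;
    while (isdigit((unsigned char)*p)) {
        length = length * 10 + (size_t)(*p - '0');
        p++;
    }
    if (length >= out_size || strlen(p) < length) return false;

    memcpy(out, p, length);
    out[length] = '\0';
    *cursor = p + length;
    return true;
}

/* Hypothetical result type for this sketch. */
struct swift_method_name {
    char module[64];
    char class_name[64];
    char function[64];
    char attribute; /* 'f', 'g', 's', ... */
};

/* Parses "_TFC<module><class><function><attribute>..." and stops there. */
static bool parse_method_symbol(const char *symbol, struct swift_method_name *out) {
    const char *p = symbol;

    assert(symbol != NULL && out != NULL);
    if (strncmp(p, "_TFC", 4) != 0) return false;
    p += 4;
    if (!read_identifier(&p, out->module, sizeof out->module)) return false;
    if (!read_identifier(&p, out->class_name, sizeof out->class_name)) return false;
    if (!read_identifier(&p, out->function, sizeof out->function)) return false;
    if (*p == '\0') return false;
    out->attribute = *p;
    return true;
}
```

Symbols that place the attribute before the name, as the getter and setter symbols used for hooking do, would need an extra branch.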

Function Attributes

Character   Type
f           Normal Function
s           Setter
g           Getter
d           Destructor
D           Deallocator
c           Constructor
C           Allocator

Swift Builtins

Character   Type
a           Array
b           Bool
c           UnicodeScalar
d           Double
f           Float
i           Int
u           UInt
Q           ImplicitlyUnwrappedOptional
S           String

There’s a lot more to name mangling than just functions, but I’ve just given a brief overview.
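
For use in a tool, the two tables above translate directly into lookup functions. This is a plain C transcription of the tables; the function names are my own, and unknown characters simply return NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Function attribute characters from the first table. */
static const char *function_attribute_name(char c) {
    switch (c) {
    case 'f': return "Normal Function";
    case 's': return "Setter";
    case 'g': return "Getter";
    case 'd': return "Destructor";
    case 'D': return "Deallocator";
    case 'c': return "Constructor";
    case 'C': return "Allocator";
    default:  return NULL;
    }
}

/* Builtin type characters (the character after 'S') from the second table. */
static const char *builtin_type_name(char c) {
    switch (c) {
    case 'a': return "Array";
    case 'b': return "Bool";
    case 'c': return "UnicodeScalar";
    case 'd': return "Double";
    case 'f': return "Float";
    case 'i': return "Int";
    case 'u': return "UInt";
    case 'Q': return "ImplicitlyUnwrappedOptional";
    case 'S': return "String";
    default:  return NULL;
    }
}
```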

Function Hooking

Enough with semantics, let’s get to the fun part! Let’s say we have a class like so:

class Shape {
    var numberOfSides: Int;

    init(){
        numberOfSides = 5;
    }
}

Let’s say we want to change the numberOfSides to 4. There are multiple ways to do this. We could use MobileSubstrate to hook into the getter method, and change the return value, like so:

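#include <dlfcn.h>
#include <substrate.h>

// Pointer to the Swift getter, resolved at load time.
int (*numberOfSides)(id self);

// Replacement getter: ignore the stored value and always report 4.
MSHook(int, numberOfSides, id self){
    return 4;
}

%ctor{
    // Look up the getter by its mangled name, then install the hook.
    numberOfSides = (int (*)(id self)) dlsym(RTLD_DEFAULT, "_TFC9swifttest5Shapeg13numberOfSidesSi");
    MSHookFunction(numberOfSides, MSHake(numberOfSides));
}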

If we create an instance of Shape and print out the value of numberOfSides, we see 4! That wasn’t so bad, was it? Now, I know what you’re thinking: “aren’t you supposed to return an object instead of the literal 4?”

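Well, in Swift, a lot of the builtin types are plain values rather than objects. An Int, for example, is a machine integer like those in C. It is word-sized, so on a 64-bit build it corresponds to a C long rather than an int (the hooks in this post declare it as int, which in practice works for small values). One small note: the String type is a bit odd. It’s a little-endian UTF-16 string, so C string literals can’t be used for it.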

Let’s do the same thing, but this time, we’ll hook the setter instead of the getter.

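// Pointer to the Swift setter; note that self is the last argument.
void (*setNumberOfSides)(int newNumber, id self);

MSHook(void, setNumberOfSides, int newNumber, id self){
    // Call the original setter, but always store 4.
    _setNumberOfSides(4, self);
}

%ctor {
    // Resolve the setter (attribute 's') by its mangled name and install the hook.
    setNumberOfSides = (void (*)(int newNumber, id self)) dlsym(RTLD_DEFAULT, "_TFC9swifttest5Shapes13numberOfSidesSi");
    MSHookFunction(setNumberOfSides, MSHake(setNumberOfSides));
}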

Try it again and… it’s still 5. What is happening, you ask? Well, in certain places in Swift, functions are inlined. The class constructor is one of these places: it sets the numberOfSides ivar directly, without going through the setter. So, the setter will only be called if the number is set again from the top-level code. Set it from there and, what do you know, we get 4.

Finally, let’s change numberOfSides by directly setting the ivar.

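// Pointer to the Swift setter; note that self is the last argument.
void (*setNumberOfSides)(int newNumber, id self);

MSHook(void, setNumberOfSides, int newNumber, id self){
    // Skip the original setter and write the ivar directly.
    MSHookIvar<int>(self, "numberOfSides") = 4;
}

%ctor {
    // Resolve the setter (attribute 's') by its mangled name and install the hook.
    setNumberOfSides = (void (*)(int newNumber, id self)) dlsym(RTLD_DEFAULT, "_TFC9swifttest5Shapes13numberOfSidesSi");
    MSHookFunction(setNumberOfSides, MSHake(setNumberOfSides));
}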

This works. It’s not recommended, but it works.

That’s all I have to write about for now. There are quite a few other things that I’m looking at, including witness tables, but I don’t know enough about them to write about them yet. A lot of things in this post are subject to change. They’re just what I’ve reverse engineered so far by looking at the runtime and binaries compiled with Swift.

What I’ve found here is very good. It means that MobileSubstrate will not die along with Objective-C, and tweaks can still be made! I wonder what the future has in store for the jailbreaking scene… maybe Logos could be updated to automatically mangle names? Or even a library that deals with common Swift types…
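
As a rough idea of what automatic mangling could look like, the C sketch below builds the getter and setter symbols for an Int property by following the pattern of the symbols used in the hooks above. The function name is hypothetical, and it covers only this one pattern (a class property of type Int), not the general mangling scheme.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Builds the accessor symbol for an Int property, following the pattern of
   "_TFC9swifttest5Shapeg13numberOfSidesSi".
   accessor is 'g' for the getter or 's' for the setter.
   Returns false if the result does not fit in out. */
static bool mangle_int_accessor(char *out, size_t out_size,
                                const char *module, const char *class_name,
                                const char *property, char accessor) {
    int written;

    assert(accessor == 'g' || accessor == 's');
    written = snprintf(out, out_size, "_TFC%zu%s%zu%s%c%zu%sSi",
                       strlen(module), module,
                       strlen(class_name), class_name,
                       accessor,
                       strlen(property), property);
    return written >= 0 && (size_t)written < out_size;
}
```

The resulting string could then be passed to dlsym in place of the hand-written symbol names.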

If you find out more about how Swift works, don’t hesitate to let me know!
