Exploring Swift using GPUImage

June 30, 2014

Author: Brad Larson

I always find it more effective to learn new programming concepts by building projects using them, so I decided to do the same for Apple's new Swift language. I also wanted to see how well it would interact with my open source GPUImage framework. As a result, I made GPUImage fully Swift-compatible, and I've built a couple of Swift sample applications and committed them to the GitHub repository. I wanted to write down some of the things that I learned when building these.

Getting an Objective-C framework into Swift

Swift is designed to interoperate with Objective-C code, and does so both by allowing you to bridge to Objective-C code in your current project and by pulling in Objective-C modules. Modules are extensions of the traditional framework bundle we're used to on the Mac, but they add some new capabilities for making these frameworks easier to work with.

In Xcode 5, we were only able to use modules for system frameworks, but Xcode 6 adds support for building your own modules for third-party frameworks. Swift requires that Objective-C frameworks be modules, so I had to make GPUImage build as one.

On the Mac, where I was already building a full framework bundle, this was trivial and only involved checking an additional build setting. On iOS, I was building the framework as a static library, so I had to do a little work to add a true framework target and make that build as a module named "GPUImage". The difficulty was that I already had a static library target named GPUImage, and Xcode didn't seem to like building a slightly differently named target that still built something called "GPUImage". I had to hand-edit the module's mapping file to get this to work, but all of that is now incorporated into the GPUImage GitHub repository.

See the "Adding this as a framework (module) to your Mac or iOS project" section of the GPUImage README.md file for instructions on how to pull this into your Swift project. The big thing that I discovered in this is that iOS frameworks are indeed supported back to iOS 7 (I'd read otherwise). You just need to make sure that you create a Copy Files build phase and choose Frameworks from the pulldown within it. Then add the GPUImage framework to that phase to make sure it is packaged within your application bundle.

One additional thing I was able to do was to use GPUImage within a Swift playground. I posted this on Twitter, but despite there not yet being an official way to include third-party code in a Swift playground, I found that it works if you build a release version of a module (framework), target it at the iOS Simulator, and copy it into the SDK frameworks directory within Xcode:

Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator8.0.sdk/System/Library/Frameworks

Swift playgrounds will then let you import that module as if it were a system framework. This is excellent for rapid prototyping of code using that framework, and I've already been using it to test out specific capabilities.

A tiny starting application

The first step when working with GPUImage in Swift was to create a simple test application. For the fun of it, I wanted to see how small I could make this and still have it be functional in Swift.

This is the full code for a Swift application that takes in live camera video from the rear camera, processes it on the GPU, and displays the realtime results to the screen:

AppDelegate.swift:

import UIKit
 
@UIApplicationMain
class AppDelegate: UIResponder, UIApplicationDelegate {
 
    var window: UIWindow?
 
    func application(application: UIApplication, didFinishLaunchingWithOptions launchOptions: NSDictionary?) -> Bool {
        return true
    }
}

ViewController.swift:

import UIKit
import GPUImage
 
class ViewController: UIViewController {
 
    var videoCamera:GPUImageVideoCamera?
    var filter:GPUImagePixellateFilter?
 
    override func viewDidLoad() {
        super.viewDidLoad()
 
        videoCamera = GPUImageVideoCamera(sessionPreset: AVCaptureSessionPreset640x480, cameraPosition: .Back)
        videoCamera!.outputImageOrientation = .Portrait
        filter = GPUImagePixellateFilter()
        videoCamera?.addTarget(filter)
        filter?.addTarget(self.view as GPUImageView)
        videoCamera?.startCameraCapture()
    }
}

That's it. Only 23 lines of non-whitespace code total for a complete GPUImage application. Nice.

Leveraging Swift for cleaner code

I next wanted to see if I could use some new Swift language features to clean up my most elaborate GPUImage sample application, the FilterShowcase which runs through each of the ~170 filters and operations in the framework. In particular, this seemed like a great use of the new-style enums and their associated values.

Filters and other operations in GPUImage come in various types, with varying input requirements and means of modifying their properties. Right now, I handle this with a traditional numerical enum and switch statements in four places within the FilterShowcase example: one switch for the label to provide in the list of filters, another for setting up the filter, one for providing extra custom setup to certain filters, and a fourth for taking the input from an onscreen slider and translating that to changes in filter properties.

I wanted to see if I could make it so that you could provide all the information required for each filter case in one compact form, and only need to add or change elements in a single file to add a new filter or to tweak an existing one. Associated values for enums, combined with closures, sounded like just the way to do this, so I came up with the following Swift types:

enum FilterSliderSetting {
    case Disabled
    case Enabled(minimumValue:Float, initialValue:Float, maximumValue:Float, sliderUpdateCallback:((filter:GPUImageOutput, sliderValue:Float) -> ())?)
}
 
enum FilterOperationType {
    case SingleInput(filter:GPUImageOutput)    
    case Blend(filter:GPUImageOutput, blendImage:UIImage)
    case Custom(setupFunction:(camera:GPUImageVideoCamera, outputView:GPUImageView, blendImage:UIImage?) -> (filter:GPUImageOutput))
}
 
class FilterOperation {
    let listName: String
    let titleName: String
    let sliderConfiguration: FilterSliderSetting
    let filterOperationType: FilterOperationType
 
    init(listName: String, titleName: String, sliderConfiguration: FilterSliderSetting,  filterOperationType: FilterOperationType) {
        self.listName = listName
        self.titleName = titleName
        self.sliderConfiguration = sliderConfiguration
        self.filterOperationType = filterOperationType
    }
}

For a filter that doesn't have any properties to update, all you'd need is the .Disabled setting for the FilterSliderSetting enum. For ones that do, you provide the initial, minimum, and maximum values for the slider to be displayed, as well as a closure that does the updating of the filter whenever the slider moves. I really like the ability to combine this in one logically organized place using the associated values for an enum.

Likewise, there are three main categories of filters: ones that only have a single input and display an image, ones that take in a video feed and blend it with another image, and ones that require completely custom setup of the filter and supporting operations. All three can be represented using the FilterOperationType enum, with varying parameters for the different cases, including a closure for completely custom setup.
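To make the associated-value pattern concrete, here's a minimal sketch of how slider settings and an update closure can travel together in one enum case and then be read back with a switch. It's written in current Swift syntax rather than the 2014 beta syntax used in this post, and StubSepiaFilter, SliderSetting, ShowcaseOperation, and sliderMoved are hypothetical stand-ins so that it runs without GPUImage. It is not the FilterShowcase code itself.

```swift
// Hypothetical stand-in for a GPUImage filter; not the real class.
final class StubSepiaFilter {
    var intensity: Double = 1.0
}

// Slider settings carried as associated values, update closure included.
enum SliderSetting {
    case disabled
    case enabled(minimum: Float, initial: Float, maximum: Float,
                 update: (StubSepiaFilter, Float) -> Void)
}

// Everything needed to describe one filter lives in a single value.
struct ShowcaseOperation {
    let listName: String
    let slider: SliderSetting
    let filter: StubSepiaFilter
}

let sepia = ShowcaseOperation(
    listName: "Sepia tone",
    slider: .enabled(minimum: 0.0, initial: 1.0, maximum: 1.0,
                     update: { filter, value in filter.intensity = Double(value) }),
    filter: StubSepiaFilter()
)

// What a view controller might do when the onscreen slider moves:
// pull the range and the closure out of the enum case and apply them.
func sliderMoved(to value: Float, for operation: ShowcaseOperation) {
    switch operation.slider {
    case .disabled:
        break // this filter has nothing to adjust
    case .enabled(let minimum, _, let maximum, let update):
        let clamped = min(max(value, minimum), maximum)
        update(operation.filter, clamped)
    }
}

sliderMoved(to: 0.25, for: sepia)
```

The view controller never needs to know which filter it is dealing with; it only switches over the setting and calls whatever closure the case carries.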

This lets me then create a single array of FilterOperations using a terse yet descriptive format:

let filterOperations: Array<FilterOperation> = [
    FilterOperation(
        listName:"Sepia tone",
        titleName:"Sepia Tone",
        sliderConfiguration:.Enabled(minimumValue:0.0, initialValue:1.0, maximumValue:1.0, sliderUpdateCallback:{(filter:GPUImageOutput, sliderValue:Float) in
            (filter as GPUImageSepiaFilter).intensity = CGFloat(sliderValue)
        }),
        filterOperationType:.SingleInput(filter:GPUImageSepiaFilter())
    ),
]

and then have the view controllers extract the information they need from the appropriate operations in a generic manner, using closures to set up and update filters in response to user interaction.

Unfortunately, the above code chokes LLVM at present, because it can't currently handle enums with closures as associated values (rdar://17500139). The explicit CGFloat() cast in the above also shouldn't be necessary, but it avoids a compiler segfault (rdar://17499776). To work around the first problem, I had to employ optional variables in the FilterOperation class for the two closures, but the compiler will hopefully be fixed soon.

This Swift-enabled reorganization of code is a huge win, making it so I don't have to change code in four places to add a new filter or change an existing one, as well as reducing this from 14 lines of code per filter to 8 (10 with the temporary closure workaround described above). With ~170 filters, that adds up. I love being able to work in only one file and to ignore the rest of the application going forward.

I recognize that I could mostly bring these structural improvements back to my Objective-C example (and I probably will), but the only reason I thought to do this was the mindset that Swift puts you into. It does seem to help you think of new ways of organizing your code.

Again, both of these sample applications are now live in the GPUImage GitHub repository if you want to pull them down and tinker with them.

Update: 7/11/2014

Further improvements using generics

I wasn't completely happy with the need to specify the type of filter class twice in the above, nor the explicit cast in the slider callback, so I decided to experiment with generics. Generics allow you to create type-specific classes and functions from a more general base, while preserving types throughout. That seemed ideal for the slider update callback, in particular, to make it specific to the class in question.

After using generics for this, the filter operation setup code now looks like the following:

let filterOperations: Array<FilterOperationInterface> = [
    FilterOperation<GPUImageSepiaFilter>(
        listName:"Sepia tone",
        titleName:"Sepia Tone",
        sliderConfiguration:.Enabled(minimumValue:0.0, initialValue:1.0, maximumValue:1.0),
        sliderUpdateCallback: {(filter, sliderValue) in
            filter.intensity = CGFloat(sliderValue)
        },
        filterOperationType:.SingleInput,
        customFilterSetupFunction: nil
    ),
]

To my eyes, this is easier to read, in that the class the showcase example is built around is prominently featured at the top of the FilterOperation initialization block of code, rather than later in the initializer parameters. You'll note that the slider callback closure is a lot cleaner, since I no longer have to do a forced cast and the compiler can check to make sure the property I'm setting actually exists on that specific filter class.

This does break out the closures for the slider update and filter setup as optionals, due to the previously mentioned compiler problems with using those as enum associated values, but once the compiler supports that I'll be able to remove those portions.

I did have to add a little more setup code in the operation type definition to support this, mostly around creating a protocol that these operations conform to. Since each instance of a generic will be specific to the class you associate with it, you can't really have an umbrella type that contains all these variants. A protocol is a way to provide an overall interface to all of these individualized variants, so you'll notice I now use the FilterOperationInterface protocol as the type for the items in the array. I also use this for properties that take in these filter operation description classes.
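As a rough illustration of why the protocol is needed, the following sketch puts two differently specialized generic objects into one array by typing the array with a protocol. It uses current Swift syntax, and StubFilter, StubSepia, StubPixellate, OperationInterface, and TypedOperation are all hypothetical names standing in for the GPUImage classes and the types described here.

```swift
// Hypothetical stand-ins for a filter base class and two filters.
class StubFilter {
    required init() {}
}
final class StubSepia: StubFilter {
    var intensity: Double = 1.0
}
final class StubPixellate: StubFilter {
    var blockWidth: Double = 0.05
}

// The umbrella interface that the array is typed with.
protocol OperationInterface {
    var listName: String { get }
    func update(sliderValue: Float)
}

// Each specialization is a distinct type, but all of them conform to the protocol.
final class TypedOperation<Filter: StubFilter>: OperationInterface {
    let listName: String
    let filter = Filter()
    let callback: (Filter, Float) -> Void

    init(listName: String, callback: @escaping (Filter, Float) -> Void) {
        self.listName = listName
        self.callback = callback
    }

    func update(sliderValue: Float) {
        // The closure receives the concrete filter type, so no cast is needed.
        callback(filter, sliderValue)
    }
}

let sepiaOperation = TypedOperation<StubSepia>(listName: "Sepia tone") { filter, value in
    filter.intensity = Double(value)
}
let pixellateOperation = TypedOperation<StubPixellate>(listName: "Pixellate") { filter, value in
    filter.blockWidth = Double(value)
}

// TypedOperation<StubSepia> and TypedOperation<StubPixellate> share no common
// generic type, so the array is declared in terms of the protocol instead.
let operations: [OperationInterface] = [sepiaOperation, pixellateOperation]
for operation in operations {
    operation.update(sliderValue: 0.5)
}
```

Inside each closure the compiler knows the exact filter class, while code that walks the array only sees the protocol's members.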

The generic version of the FilterOperation class now looks like:

class FilterOperation<FilterClass: GPUImageOutput where FilterClass: GPUImageInput>: FilterOperationInterface {
    var internalFilter: FilterClass?
    let listName: String
    let titleName: String
    let sliderConfiguration: FilterSliderSetting
    let filterOperationType: FilterOperationType
    let sliderUpdateCallback: ((filter:FilterClass, sliderValue:Float) -> ())?
    let customFilterSetupFunction: ((camera:GPUImageVideoCamera, outputView:GPUImageView, blendImage:UIImage?) -> (filter:GPUImageOutput))?
 
    init(listName: String, titleName: String, sliderConfiguration: FilterSliderSetting, sliderUpdateCallback:((filter:FilterClass, sliderValue:Float) -> ())?, filterOperationType: FilterOperationType, customFilterSetupFunction:((camera:GPUImageVideoCamera, outputView:GPUImageView, blendImage:UIImage?) -> (filter:GPUImageOutput))?) {
        self.listName = listName
        self.titleName = titleName
        self.sliderConfiguration = sliderConfiguration
        self.filterOperationType = filterOperationType
        self.sliderUpdateCallback = sliderUpdateCallback
        self.customFilterSetupFunction = customFilterSetupFunction
        switch (filterOperationType) {
            case .Custom:
                break
            default:
                self.internalFilter = FilterClass()
        }
    }
 
    var filter: GPUImageOutput {
        return internalFilter!
    }
 
    func configureCustomFilter(filter:GPUImageOutput) {
        self.internalFilter = (filter as FilterClass)
    }
 
    func updateBasedOnSliderValue(sliderValue:Float) {
        if let updateFunction = sliderUpdateCallback
        {
            updateFunction(filter:internalFilter!, sliderValue:sliderValue)
        }
    }
}

I'm still experimenting with this, but I like the way the new generic version of the filter operation definition works for specifying each filter type in the overall showcase. It's type safe, easy to read, and lacks redundant code.

Problems with multiline strings and GPUImageOutput

Swift is an evolving language, though, and there are a few areas that I had problems with. Aside from the above-mentioned compiler troubles, I couldn't find a way to replicate two common pieces of functionality I use.

The first is multiline string constants. As I described on Stack Overflow, I have a little helper macro to let me inline strings for shader programs within GPUImage. Swift does away with macros, but doesn't really have a solution for multiline string literals (rdar://17421963). That's preventing me from doing live shader prototyping in Swift playgrounds, which I'd love to do.

The second issue is that Swift doesn't have a one-to-one representation of variables or method parameters that accept objects of a certain class, but only those objects of that class that conform to a specific protocol. In Objective-C, you can do this with Class<Protocol> (like GPUImageOutput<GPUImageInput>, which I use throughout GPUImage), but Swift doesn't let you specify this directly. Swift auto-generated headers convert something like GPUImageOutput<GPUImageInput> into just GPUImageOutput, which is wrong and will potentially lead to non-conforming objects being passed into Objective-C classes and methods, which could trigger runtime crashes (rdar://17500629).

I asked about this on Stack Overflow, and while the suggestions to use generics and their where clause were interesting, that's a compile-time type restriction, not a runtime one. For example, this wouldn't seem to allow for GPUImageOutput<GPUImageInput> properties that could be set to various class types at runtime, as long as they all conformed to those two requirements. Additionally, using a generic type like this on something like a UIViewController that has IBOutlets throws a compiler error due to its inability to generate Objective-C code to satisfy that case.

For now, I'm going with GPUImageOutput inputs and then doing conditional casts using (object as? GPUImageInput) within the appropriate places, but I'd like to see a better way of handling this.
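The conditional-cast workaround can be sketched roughly as follows. This is current Swift syntax with hypothetical stand-in types (StubOutput, StubInput, StubBlendFilter, StubCamera) and a hypothetical forwardFrame function, not GPUImage's real classes or methods.

```swift
// Hypothetical stand-ins: a base class, and a protocol that only some
// subclasses adopt.
class StubOutput {}

protocol StubInput: AnyObject {
    func newFrameReady()
}

// An output that is also an input, like a filter in the middle of a chain.
final class StubBlendFilter: StubOutput, StubInput {
    var framesSeen = 0
    func newFrameReady() {
        framesSeen += 1
    }
}

// An output that is not an input, like a camera at the start of a chain.
final class StubCamera: StubOutput {}

// Accept the plain base class, then check protocol conformance at runtime.
func forwardFrame(to object: StubOutput) -> Bool {
    guard let input = object as? StubInput else {
        return false // does not conform, so skip it rather than crash
    }
    input.newFrameReady()
    return true
}

let blendFilter = StubBlendFilter()
let accepted = forwardFrame(to: blendFilter)
let rejected = forwardFrame(to: StubCamera())
```

The cost of this approach is that a non-conforming object is only caught when the code runs, which is exactly the gap described above.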

Overall, I'm having a lot of fun exploring Swift and the patterns it encourages. I particularly like how it eliminates redundant or noise code and streamlines the process of building a working application. I bet there are ways to improve some of what I've done so far using generics and the like, but I'm still figuring out how best to deal with the new language features.