Clean dynamic font API in Swift

There are generally a couple of ways to get custom fonts loaded into your apps.

  1. Include them in your bundle and load them at launch time via Info.plist
  2. Dynamically load fonts after launch, even from a remote URL

Loading many fonts via your Info.plist is generally impractical because it slows down your app’s launch time. However, it removes the need for any code to load the fonts into memory and guarantees they exist well before you need to access them. As an added bonus, fonts loaded this way can be used in XIBs and Storyboards.

Dynamically loading fonts lets you load them on demand, and you can even download them from a remote URL. This approach also doesn’t require you to touch your Info.plist.

I recently worked on an app that contained around 15-20 fonts, most of which were not required during the early stages of the app’s lifecycle, so I opted to load the fonts dynamically, on demand.

For the purposes of this article, I’d like to focus on a clean API pattern that I’ve used across various apps.

API Implementation

When dynamically loading fonts, we generally need 3 pieces of information: the font’s name, filename and extension.

public protocol FontCacheDescriptor: Codable {
    var fontName: String { get }
    var fileName: String { get }
    var fileExtension: String { get }
}

Now we have a type that describes our custom font. In most cases on iOS, we deal with TrueType fonts, so let’s define a default implementation for that.

public extension FontCacheDescriptor {
    // Default to TrueType; a conforming type can still provide its own value.
    var fileExtension: String {
        return "ttf"
    }
}

The approach I’m going to suggest makes use of enum types. With that in mind, let’s add another extension that uses the rawValue when our enum is RawRepresentable with a String raw value.

public extension FontCacheDescriptor
    where Self: RawRepresentable, Self.RawValue == String {
    // The raw value doubles as both the font name and the filename.
    var fontName: String {
        return rawValue
    }

    var fileName: String {
        return rawValue
    }
}
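To see how these defaults compose, here is a minimal, self-contained sketch. The `Example` and `ExampleOTF` enums and their file names are made up for illustration; the protocol and extensions simply repeat the definitions above so the snippet compiles on its own.

```swift
protocol FontCacheDescriptor: Codable {
    var fontName: String { get }
    var fileName: String { get }
    var fileExtension: String { get }
}

extension FontCacheDescriptor {
    // Default: TrueType
    var fileExtension: String { return "ttf" }
}

extension FontCacheDescriptor where Self: RawRepresentable, Self.RawValue == String {
    var fontName: String { return rawValue }
    var fileName: String { return rawValue }
}

// Hypothetical font family that relies entirely on the defaults.
enum Example: String, FontCacheDescriptor {
    case regular = "Example-Regular"
    case bold = "Example-Bold"
}

// Hypothetical family shipped as OpenType, overriding only the extension.
enum ExampleOTF: String, FontCacheDescriptor {
    case regular = "ExampleOTF-Regular"

    var fileExtension: String { return "otf" }
}

print(Example.bold.fileName, Example.bold.fileExtension)
print(ExampleOTF.regular.fileName, ExampleOTF.regular.fileExtension)
```

The enum cases carry no code of their own: the name and filename come from the raw value, and only a family that differs from the default has to say so.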

Here comes the meat of this API. In order to load our custom font we need to perform the following tasks.

  1. Register the font with the system and load it into memory (only if it’s not already cached)
  2. If we’re targeting iOS 11+, scale the font’s size based on the current UIContentSizeCategory
  3. Make a descriptor for the font

So let’s add a convenience initializer to UIFont.

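import UIKit

extension UIFont {

    public convenience init(descriptor: FontCacheDescriptor, size: CGFloat) {
        // 1. Register the font file with the system, unless it is already cached
        FontCache.cacheIfNeeded(named: descriptor.fileName, fileExtension: descriptor.fileExtension)
        // 2. Scale the requested size for the current content size category (iOS 11+)
        let size = UIFontMetrics.default.scaledValue(for: size)
        // 3. Describe the font; passing a size of 0 below keeps the descriptor's size
        let descriptor = UIFontDescriptor(name: descriptor.fontName, size: size)
        self.init(descriptor: descriptor, size: 0)
    }

}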
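The initializer relies on a `FontCache` type that isn’t shown in this post. As a rough sketch only, and not necessarily how the real one is written, something along these lines would do the job. It assumes the font files live in a bundle (the `in bundle:` parameter is my addition) and registers them with Core Text’s `CTFontManagerRegisterFontsForURL` where Core Text is available.

```swift
import Foundation
#if canImport(CoreText)
import CoreText
#endif

/// A hypothetical stand-in for the `FontCache` used by the initializer above.
public enum FontCache {
    private static var registered = Set<String>()   // files registered so far
    private static let lock = NSLock()

    /// Registers `name.fileExtension` from `bundle` at most once per process.
    /// Returns false if the file is missing or registration fails.
    @discardableResult
    public static func cacheIfNeeded(named name: String,
                                     fileExtension: String,
                                     in bundle: Bundle = .main) -> Bool {
        lock.lock()
        defer { lock.unlock() }

        let key = "\(name).\(fileExtension)"
        if registered.contains(key) { return true }  // already cached

        guard let url = bundle.url(forResource: name, withExtension: fileExtension) else {
            return false
        }
        #if canImport(CoreText)
        guard CTFontManagerRegisterFontsForURL(url as CFURL, .process, nil) else {
            return false
        }
        #else
        _ = url // no Core Text on this platform; the sketch only records the file
        #endif

        registered.insert(key)
        return true
    }
}
```

For a font downloaded from a remote URL, the same idea applies once the file has been saved to disk; you would pass its local file URL to the registration call instead of looking it up in a bundle.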

API Usage

With all the code in place, we can now easily create clean APIs around our custom fonts for use in our UI code.

Let’s say we have a font called Graphik and it has 4 variants.

extension UIFont {
    
    // The `rawValue` MUST match the filename (without extension)
    public enum Graphik: String, FontCacheDescriptor {
        case regular = "GraphikAltWeb-Regular"
        case medium = "GraphikAltWeb-Medium"
        case regularItalic = "GraphikAltWeb-RegularItalic"
        case mediumItalic = "GraphikAltWeb-MediumItalic"
    }
    
    /// Makes a new font with the specified variant and size
    public convenience init(graphik: Graphik, size: CGFloat) {
        self.init(descriptor: graphik, size: size)
    }
    
}

Now our UI code can simply create an instance of this font.

let font = UIFont(graphik: .regular, size: 16)

Conclusion

This API is clean and simple, and it provides various benefits:

  1. Typed font names
  2. Dynamic type support
  3. Dynamic font loading
  4. Font caching

Code Review with Git Worktree

I’ve been using a little-known feature for some time now and thought it might be worthy of a post for those (like myself) who prefer the console world of Git.

Scenario

Paraphrasing from the Git documentation:

You are in the middle of a refactoring session, unable to commit and you need to checkout another branch to code review, reference or something else. You might typically use git-stash to store your changes away temporarily, however, your working tree is in such a state of disarray that you don’t want to risk disturbing any of it. Use a Git Worktree.

I think we can all agree that this is an all too common scenario.

Worktrees are also extremely useful during code review. You don’t want to commit your changes, but need to check out someone else’s branch to ensure the work they’ve done actually works. Or perhaps you need to check out the code to better understand how it works.

Worktrees

In simple terms, a worktree is an additional working tree linked to your main one: it has its own checked-out files but shares the same repository. This means it’s extremely fast to create and, because it’s generally temporary, you can simply delete it once you’re done.

Workflow

So let’s take a look at a typical workflow.

# git worktree add $PATH $BRANCH
# Adds a new worktree at 'path' and performs a checkout of 'branch'
# path:   prs/5.2.0
# branch: release/5.2.0

git worktree add prs/5.2.0 release/5.2.0

Hack away

Simply `cd` into the directory and hack away. The sub-folder is a complete checkout of the branch you specified. It contains a `.git` file (rather than a folder) that points back to your main repository. Any changes you make here affect only that branch, leaving your original working tree untouched.

cd prs/5.2.0
open .

Push and Prune

If you made changes in your worktree, simply push them to your origin as usual. Then you can remove the sub-folder and prune the worktree from Git.

git push # optional
cd ../..
rm -rf prs # you can remove just the 5.2.0 folder if you have multiple worktrees
git worktree prune

List

It’s important to remove your worktree when you’re done, because by default Git won’t check out the same branch in more than one worktree at a time. If you need to remember where you created your worktrees, you can list them using:

git worktree list

Conclusion

Worktrees are a great tool to have in your bag. Using this approach, I’ve found I’m more likely to check out someone’s branch during code review to ensure it works.

Protocols & Mutability in Swift

Protocol Oriented Programming (POP) has become synonymous with the Swift programming language. It’s an inherently better fit for composition and generally leads to better architecture and code reuse. However, it’s often difficult to deal with mutability.

Objective-C

In the Objective-C world, common practice (perhaps learned from Apple’s SDK) was to create a Mutable subclass that provided read/write properties. This was extremely cumbersome and required a lot of maintenance. I think in Swift we can do better.

Example Problem

Let’s say we want a type to represent a control in our UI. This type should be immutable and has a single property defining its current state.

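public protocol Control {
    var state: State { get }
}

public final class Button: Control {
    public let state: State

    // A class with a non-optional stored property needs an initializer
    public init(state: State) {
        self.state = state
    }
}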

Now let’s define some API that will return an instance of our Control.

public func button(from json: JSONDictionary) -> Control {
    // `makeControl(from:)` stands in for whatever builds the control from JSON
    var control: Control = makeControl(from: json)
    control.state = .default // error: 'state' is a get-only property
    return control
}

The code above will fail to compile, because state is read-only. Now we could obviously make it { get set }, but that would allow our API consumers to modify the variable too.

Solution

Instead we can introduce a Mutable variant of our protocol. 

public protocol Control { 
    var state: State { get }
}

internal protocol MutableControl: Control {
    var state: State { get set }
}

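public final class Button: MutableControl {
    // Readable everywhere, writable only inside our module
    public internal(set) var state: State

    public init(state: State) {
        self.state = state
    }
}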

Notice that our Mutable variant is marked as internal. This is the secret ingredient to making this work.

Now we can simply update our API to use MutableControl internally, while still returning the read-only Control, without needing to expose internal features.

public func button(from json: JSONDictionary) -> Control {
    // Inside the module we work with the mutable variant...
    var control: MutableControl = makeControl(from: json)
    control.state = .default
    // ...but callers only ever see the read-only `Control`
    return control
}
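Here is the whole pattern as one small, self-contained sketch. The `State` cases and the `makeButton()` factory are placeholders of mine (the JSON parsing is left out), so treat it as an illustration rather than the exact code from a real project.

```swift
// Hypothetical states; the real `State` type isn't shown in this post.
public enum State {
    case `default`, highlighted, disabled
}

public protocol Control {
    var state: State { get }
}

// Internal: only visible inside the module that defines it.
internal protocol MutableControl: Control {
    var state: State { get set }
}

public final class Button: MutableControl {
    public internal(set) var state: State

    public init(state: State) {
        self.state = state
    }
}

// Placeholder factory standing in for `button(from:)`.
public func makeButton() -> Control {
    var control: MutableControl = Button(state: .disabled)
    control.state = .default   // allowed: we hold the mutable variant
    return control             // callers receive the read-only protocol
}

let control = makeButton()
print(control.state)
// control.state = .highlighted   // would not compile: 'state' is get-only
```

Because the returned value is typed as `Control`, code outside the module has no setter to call, while code inside the module can still cast to or hold a `MutableControl`.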

Summary

This is a really nice approach to handling mutability when using Swift protocols. If you liked this post or want to get in touch, I’d love to hear from you.

A Lexer in Swift

This is a multi-post series covering the definition, example code and basic error handling/recovery of a Lexer, Parser and AST written in Swift (although the concepts can be applied to any language).

These posts are targeted at beginners, so I will only cover the basics for getting started. This post will focus entirely on the Lexer.

I’ll also include a Swift Playground at the bottom of each post if you want to try it out for yourself (requires Xcode 9 and Swift 4).

Why build a Lexer?

I wanted to write a Lexer/Parser to improve my skillset. I wanted to better understand the definitions of the various components, the responsibilities of each and more importantly why I’d want to write one in the first place.

I had trouble finding simple examples that just went over the basics, so that’s what I’ve tried to do here. Hopefully you’ll find this useful as well.

What I’ll cover

I’ll discuss the various components of a Lexer as well as its responsibilities. I’ll also briefly discuss performance and error handling.

To keep things beginner-friendly, I’m going to demonstrate building a Lexer for parsing arithmetic operations.

Definitions

Code Point
A code point is a single Unicode scalar value. This is a complex subject, so for the purposes of this article you can consider a code point equivalent to a character.

Throughout this article I will use the terms Character and Code Point interchangeably.

Token
A token is essentially a component of our grammar. A token generally represents 1 or more code points.

Let’s see how an equation will be tokenised.

[Image: tokens.jpg]

The responsibility of a Lexer is to generate a tokenized representation of the source string. 

Characters vs Code Points

Typically when we talk about the components of a String, we think of it being made up of Characters. However there is another level of detail here. Characters can be made up of 1 or more Code Points which are essentially UnicodeScalar values.

For the purposes of this article, however, the two can be treated as equivalent. As such I will use the term Characters, but all example code will work on UnicodeScalars.
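If you’re curious where the two actually differ, this tiny sketch shows it. The sample strings are my own; the first is an “e” followed by a combining accent, which Swift treats as one Character made of two code points.

```swift
let accented = "e\u{301}"   // "e" + COMBINING ACUTE ACCENT
print(accented.count)                  // 1 Character
print(accented.unicodeScalars.count)   // 2 code points

let equation = "5 + 23"
print(equation.count == equation.unicodeScalars.count)   // true
```

For plain ASCII input such as our equations, every Character is exactly one code point, which is why the distinction can be ignored here.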

Tokens

A Lexer is responsible for breaking a string into smaller components, called Tokens. Let’s look at a simple example:

5 + 23 * 3 = 74

Let’s start by defining the components of this expression. For the purposes of this article, we’ll only consider Integer values and 4 simple arithmetic operators.

  1. Digits (0-9)
  2. Operator (+, -, *, /)
  3. Equality (=)
  4. Space

An arithmetic equation contains 2 expressions, left and right, separated by an equality character. For the purposes of our Lexer, however, this isn’t really relevant. We simply care about the ‘components’ that make up our equation.

Omissions

To keep things simple I’ve omitted a lot of the details needed to truly parse any equation, i.e. parentheses, negative/floating-point/exponential numbers, etc. I’ll leave those as an exercise for the reader.

Whitespace

If you plan to do any form of linting you’ll require whitespace tokens. However, if you plan on writing a compiler to validate the equation, then whitespace is not important.

Tokenizer

Now that we’ve defined our possible tokens, let’s define our Lexer class that will produce them.

public final class Lexer {
    func tokenize(source: String) -> [Token] {
        var tokens: [Token] = []
        // `var`, because each consumer mutates the view as it removes what it has read
        var chars = source.unicodeScalars

        while let token = chars.consumeToken() {
            tokens.append(token)
        }

        return tokens
    }
}

Notice the helper function: chars.consumeToken()
Let’s define that now.

extension String.UnicodeScalarView {
    // Tries each consumer in turn and returns the first token produced
    mutating func consumeToken() -> Token? {
        return consumeSpace()
            ?? consumeDigit()
            ?? consumeOperator()
            ?? consumeEquals()
    }
}

Remember each consumer is responsible for returning either a single Token or nil. We are considered done when no more tokens are returned, i.e. all consumers return nil.

Space Consumer

For the purposes of this post I’ll demonstrate the first consumer; you can check out the Playground for the full implementation.

The following represents a simple white-space consumer.

extension String.UnicodeScalarView {
    mutating func consumeSpace() -> Token? {
        // `consumeCharacters(where:)` is a helper defined in the Playground
        return consumeCharacters(where: {
            $0 == " "
        }).map { .space($0.count) }
    }
}

Basically this function will attempt to consume a run of space characters. If it’s successful, it will return a token with the number of spaces found. Otherwise it will return nil and the next consumer will be run.
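If you’d like something to run without the Playground, here is one way the remaining pieces could look. This is a sketch under my own assumptions: the `Token` cases, the `consumeCharacters(where:)` and `consumeScalar(where:)` helpers, and the free `tokenize(_:)` function are all guesses at the shape of the full implementation. It also works on a slice (`Substring.UnicodeScalarView`) so that dropping what has been read doesn’t copy the rest, and it gives “=” its own `equals` case.

```swift
enum Token: Equatable {
    case number(String)
    case space(Int)
    case `operator`(String)
    case equals
}

extension Substring.UnicodeScalarView {
    /// Removes the longest non-empty prefix matching `predicate` and returns it, or nil.
    mutating func consumeCharacters(where predicate: (Unicode.Scalar) -> Bool) -> String? {
        let end = firstIndex(where: { !predicate($0) }) ?? endIndex
        guard end > startIndex else { return nil }
        let consumed = self[..<end]
        self = self[end...]   // advance past what we consumed
        return String(Substring(consumed))
    }

    /// Removes and returns the first scalar if it matches `predicate`.
    mutating func consumeScalar(where predicate: (Unicode.Scalar) -> Bool) -> Unicode.Scalar? {
        guard let scalar = first, predicate(scalar) else { return nil }
        self = dropFirst()
        return scalar
    }

    mutating func consumeSpace() -> Token? {
        return consumeCharacters(where: { $0 == " " })
            .map { Token.space($0.unicodeScalars.count) }
    }

    mutating func consumeDigit() -> Token? {
        return consumeCharacters(where: { $0 >= "0" && $0 <= "9" })
            .map { Token.number($0) }
    }

    mutating func consumeOperator() -> Token? {
        return consumeScalar(where: { "+-*/".unicodeScalars.contains($0) })
            .map { Token.operator(String(Character($0))) }
    }

    mutating func consumeEquals() -> Token? {
        return consumeScalar(where: { $0 == "=" }).map { _ in Token.equals }
    }

    mutating func consumeToken() -> Token? {
        return consumeSpace() ?? consumeDigit() ?? consumeOperator() ?? consumeEquals()
    }
}

func tokenize(_ source: String) -> [Token] {
    var chars = source.unicodeScalars[...]   // a slice over the whole string
    var tokens: [Token] = []
    while let token = chars.consumeToken() {
        tokens.append(token)
    }
    return tokens
}

print(tokenize("5 + 23"))
```

Note that this version simply stops at the first scalar no consumer recognises, which is the behaviour the error handling section below sets out to improve.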

Result

So let’s run our equation through our Lexer:

let source = "5 + 23 * 3 = 74"
let lexer = Lexer()
let tokens = lexer.tokenize(source: source)
dump(tokens)

Which will output the following:

▿ .number: "5"
▿ .space: 1
▿ .operator: "+"
▿ .space: 1
▿ .number: "23"
▿ .space: 1
▿ .operator: "*"
▿ .space: 1
▿ .number: "3"
▿ .space: 1
▿ .operator: "="
▿ .space: 1
▿ .number: "74"

So there we have it, a simple Lexer for tokenizing an equation from a string.

Error Handling

There are several types of failures that can occur.

Parse Error

The first is a parse error, where we haven’t reached the end of the file but come across a character we can’t parse. In the tokeniser above we could simply fail and return a generic error, perhaps with the position where the failure occurred.

In my own tokeniser I found that I could simply recover. The best approach I found was to change my loop to iterate while !eof (end-of-file). When I came across a character that couldn’t be parsed, I would simply consume it into an .error(Range<String.Index>) token and attempt to continue.
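As a rough illustration of that recovery strategy, the sketch below loops until the end of the input and wraps anything it can’t lex in an `.error` token. It is deliberately cut down to numbers and spaces, and the `Token` cases and `tokenize(_:)` function are assumptions of mine rather than the code from my tokeniser.

```swift
enum Token: Equatable {
    case number(String)
    case space(Int)
    case error(Range<String.Index>)   // the range of the scalar we could not lex
}

func tokenize(_ source: String) -> [Token] {
    let scalars = source.unicodeScalars
    var index = scalars.startIndex
    var tokens: [Token] = []

    // Moves `index` forward for as long as `predicate` holds.
    func advance(while predicate: (Unicode.Scalar) -> Bool) {
        while index < scalars.endIndex, predicate(scalars[index]) {
            index = scalars.index(after: index)
        }
    }

    // Iterate until end-of-file instead of stopping at the first failure.
    while index < scalars.endIndex {
        let start = index
        let scalar = scalars[index]

        if scalar == " " {
            advance(while: { $0 == " " })
            tokens.append(.space(scalars.distance(from: start, to: index)))
        } else if scalar >= "0" && scalar <= "9" {
            advance(while: { $0 >= "0" && $0 <= "9" })
            tokens.append(.number(String(source[start..<index])))
        } else {
            // Unknown scalar: record where it was and carry on.
            index = scalars.index(after: index)
            tokens.append(.error(start..<index))
        }
    }
    return tokens
}

print(tokenize("1 ? 2").count)
```

Keeping the range rather than the text means a later stage can report exactly where the problem was, and the tokens after the bad character are still available.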

Unexpected Token

Using our example equation, an unexpected token would be something like sequential operators with no digits in-between.

4 + / 5 = 9

This could also be considered a parse error; the difference here is that our Tokeniser won’t fail, because we can still parse that into tokens. This is where our Parser comes into play, but we’ll cover that in the next post.

Performance

When I initially wrote the Lexer I kept it simple (similar to the example code in this post) and made all of my tokens an enum with an associated String value.

We’re working with UnicodeScalarView – which is basically just a cursor into the string, so walking through the characters has almost zero-cost. However every time I parsed a character, I’d make a copy into a String and store it in a token. 

A better approach is to consider your token data and store it more appropriately. The following would be much more performant:

enum Token {
    case digit(Int)
    case space(Int) // number of spaces
    case addition
    case subtraction
    case multiplication
    case division
    case equals
}

Our simple equation Lexer probably wouldn’t see much benefit from this considering the small number of characters we would be parsing, even in extreme cases. However, when you’re parsing thousands of lines or more, these kinds of refactors can be hugely beneficial.

For reference, my Tokenizer went from 40ms to 16ms for a 1000 line file by making the simple change suggested above.
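To make the idea concrete, here is a small sketch of a lexer built around that enum. It accumulates each number directly into an `Int` instead of copying characters into a `String`. The `tokenize(_:)` function and its stop-on-unknown behaviour are my own simplifications, and very large numbers would overflow `Int`, which this sketch doesn’t guard against.

```swift
enum Token: Equatable {
    case digit(Int)
    case space(Int) // number of spaces
    case addition
    case subtraction
    case multiplication
    case division
    case equals
}

func tokenize(_ source: String) -> [Token] {
    var chars = source.unicodeScalars[...]   // a slice, so dropping a prefix is cheap
    var tokens: [Token] = []

    while let scalar = chars.first {
        switch scalar {
        case "0"..."9":
            // Build the value numerically; no intermediate String is created.
            var value = 0
            while let digit = chars.first, digit >= "0", digit <= "9" {
                value = value * 10 + Int(digit.value - 48)   // 48 is the code point of "0"
                chars = chars.dropFirst()
            }
            tokens.append(.digit(value))
        case " ":
            var count = 0
            while chars.first == " " {
                count += 1
                chars = chars.dropFirst()
            }
            tokens.append(.space(count))
        case "+": tokens.append(.addition); chars = chars.dropFirst()
        case "-": tokens.append(.subtraction); chars = chars.dropFirst()
        case "*": tokens.append(.multiplication); chars = chars.dropFirst()
        case "/": tokens.append(.division); chars = chars.dropFirst()
        case "=": tokens.append(.equals); chars = chars.dropFirst()
        default:
            return tokens   // stop at the first scalar we cannot lex
        }
    }
    return tokens
}

print(tokenize("5 + 23"))
```

Every token here is either payload-free or carries a plain `Int`, so producing one never allocates.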

Conclusion

Writing a lexer can be a daunting and time-consuming task, but there’s a lot to be gained by doing so.

  • Improve your understanding of Lexers/Parsers and, in part, Compilers
  • Improve your understanding of Unicode constructs
  • Improve your performance tuning skills (you’ll need them!)

If you have any questions or just want to chat, you can find me on Twitter @shaps.