An intro to ARKit framework

12 November 2017

The ARKit framework was introduced with the latest iOS release (11), and you develop for it with Xcode 9 (the iOS 11 SDK requires it). Before you get excited about building AR apps on your phone, here's a bucket of cold water you might face: the AR features are only available on devices with an A9 chip or later (A10, A11). That means your device needs to be an iPhone SE, a 6s or newer. An iPhone 6 will not do.
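If you would rather have the app check this at runtime than trust a device list, ARKit exposes an `isSupported` flag on its configuration classes. A minimal sketch (the helper name `canRunWorldTracking` is made up for illustration):

```swift
import ARKit

/// Hypothetical helper: reports whether this device can run world tracking.
func canRunWorldTracking() -> Bool {
    // `isSupported` is a class property available on ARKit configurations.
    return ARWorldTrackingConfiguration.isSupported
}
```

You could call it before presenting the AR screen and show a fallback message when it returns `false`.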

Before we get hands-on, I would like to go over some key points of the framework, such as the magic of VIO (Visual Inertial Odometry). This buddy is what makes apps built with ARKit so stable and accurate. Basically, it combines data from another powerful framework, Core Motion, with data from the device camera. Using these two, you get detailed tracking of your position plus feature point tracking, which makes the experience much better than the options we had been using before.

The last point I would like to mention is that with this framework you get an experience written completely in Swift. For those who never tried AR/VR on iOS before: it was complicated. We had to use Unity, for example, or OpenCV, and these tools can give you a hard time if you're not willing to spend time with other languages. To use OpenCV on iOS, for example, you would have to use an extension of Objective-C that can compile C++ code. It's called Objective-C++. (Catchy, right?) Unity, on the other hand, would export your code as Objective-C++, but you wouldn't have to write that syntax yourself: you would develop your features in the Unity IDE in C# or JS, and once you built the app for iOS it would auto-generate the Obj-C++ code for you. Believe me, not the most pleasant experience.

TL;DR show me what you got.

To kick off, let's set up our environment. Create a simple Swift project, open Info.plist and add this key:

Privacy - Camera Usage Description

(The raw key name is NSCameraUsageDescription.) Give it some value. It doesn't matter what it is, because in the end it is just the message shown in the alert that asks for permission to use the camera.

Once that's done, go to the ViewController in the storyboard (or not, if you are instantiating xibs) and drag an ARSCNView onto it. Set the constraints and connect an outlet to it. So far, piece of cake, right?

After these tasks, we just need a couple more lines of code:
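A minimal sketch of those lines, assuming the outlet is named `sceneView` and the session is started in `viewWillAppear` (both are assumptions for illustration, not necessarily how the original project was laid out):

```swift
import UIKit
import ARKit

class ViewController: UIViewController {

    // Assumed outlet name; connect it to the ARSCNView in the storyboard.
    @IBOutlet weak var sceneView: ARSCNView!

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)

        // Optional: draw the detected feature points and the world-origin axes.
        sceneView.debugOptions = [.showFeaturePoints, .showWorldOrigin]

        // World tracking follows the device's position and orientation.
        let configuration = ARWorldTrackingConfiguration()
        sceneView.session.run(configuration)
    }

    override func viewWillDisappear(_ animated: Bool) {
        super.viewWillDisappear(animated)

        // Stop tracking while the view is off screen.
        sceneView.session.pause()
    }
}
```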

With this configuration, we are able to track the phone's position relative to the real world at all times.
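To see that tracking in numbers, you can read the camera transform from each frame the session delivers. A small sketch, assuming a hypothetical `PositionLogger` class that you assign as the session delegate:

```swift
import ARKit

/// Hypothetical delegate object that prints the device position on every frame.
class PositionLogger: NSObject, ARSessionDelegate {

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // The last column of the camera transform holds the translation,
        // in meters, relative to the world origin.
        let translation = frame.camera.transform.columns.3
        print("x: \(translation.x), y: \(translation.y), z: \(translation.z)")
    }
}
```

To use it, keep a strong reference to the logger (the session holds its delegate weakly) and set `sceneView.session.delegate = logger` before running the session, where `sceneView` is whatever you named your ARSCNView outlet.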

That's it. A breeze. The debug options were not even needed; I just put them there for (as the name says) debugging purposes. Functionally, they add nothing. Now, if you run the app on your device, you should first see the camera permission alert and, once the session starts, your phone is being tracked by the VIO we talked about, remember? It's drawing your route like in this gif:


With these debug options set, if you walk back one or two steps from where you started the camera session, you should see a colored representation of the scene's 3D axes. You can see the other debug option when you point your camera at a surface: some yellow dots will appear on it. (Tip: the more uniform the surface, the fewer feature points are detected.) Feature point tracking relies on irregularity, corners basically, so if you target an empty whiteboard you will make things hard for it. Go for a brick wall and you will see what I'm talking about.
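If you want to compare surfaces without counting yellow dots by eye, each frame also carries the raw feature points. A rough sketch, using a hypothetical `FeaturePointCounter` class as the session delegate:

```swift
import ARKit

/// Hypothetical delegate object that logs how many feature points each frame has.
class FeaturePointCounter: NSObject, ARSessionDelegate {

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // rawFeaturePoints is nil when the frame carries no point cloud.
        let count = frame.rawFeaturePoints?.points.count ?? 0
        print("feature points in this frame: \(count)")
    }
}
```

Pointing the camera at the whiteboard and then at the brick wall should make the difference obvious in the console.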

Now we're getting somewhere…

So now you are able to see the virtual world, but it's empty. Let's put something there so it can differ from the real world.

To place geometries, lighting and other content we need an SCNNode (or a bunch of them). The definition that Apple gives us is a bit abstract. I would say it is simple enough to understand but not that simple to explain, so I decided to Cmd-C + Cmd-V it:

A node by itself is invisible unless we attach a geometry with some diffuse texture or color on it. So consider the chain SCNNode (which carries the position) > SCNGeometry > material properties such as color or texture. Check the following snippet to understand it better:
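A minimal sketch of that chain, assuming a 10 cm blue box (the size, the color and the helper name `makeBoxNode` are picked for illustration):

```swift
import UIKit
import SceneKit

/// Hypothetical helper that builds a small colored box node.
func makeBoxNode() -> SCNNode {
    // Geometry: a 10 cm cube with sharp corners.
    let box = SCNBox(width: 0.1, height: 0.1, length: 0.1, chamferRadius: 0)

    // Material: the diffuse property gives the geometry its base color.
    box.firstMaterial?.diffuse.contents = UIColor.blue

    // Node: invisible on its own, it carries the geometry and a position.
    let node = SCNNode(geometry: box)
    node.position = SCNVector3(0.5, 0.5, 0.5)
    return node
}
```

Adding it to the scene is then one line, `sceneView.scene.rootNode.addChildNode(makeBoxNode())`, where `sceneView` stands for your ARSCNView outlet.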

Also no big deal. The rootNode is your world origin (where the 3D axes from the debug option are), so once you add a node at position 0.5, 0.5, 0.5 it will be offset 0.5 m from the origin along each axis. (Yep, note that the unit is meters, so unless you're working in a big office keep things within half a meter, more or less.) Diffuse refers to the color or texture of the geometry under ambient light, afaik. There are also other material properties you can set, such as specular, which is the color/texture the object shows when a light source is pointed directly at it, or normal, and others. You can see these and much more in Xcode's scene editor when you open a scene inside a .scnassets folder. We might get there in some other article.
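As a rough illustration of diffuse versus specular, here is a sketch of a material that sets both (the helper name `makeShinyMaterial` and the colors are assumptions):

```swift
import UIKit
import SceneKit

/// Hypothetical helper that builds a red material with white highlights.
func makeShinyMaterial() -> SCNMaterial {
    let material = SCNMaterial()

    // Base color of the surface.
    material.diffuse.contents = UIColor.red

    // Color of the highlight where light reflects toward the camera.
    material.specular.contents = UIColor.white
    return material
}
```

You would assign it with something like `geometry.materials = [makeShinyMaterial()]`. The highlight only shows up when the scene has a light, so setting `autoenablesDefaultLighting = true` on the view is a quick way to try it out.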


Using ARKit is very easy, and you can create a really nice experience with just a few lines of code. Each geometry has its own properties; some are shared across geometries and some are not, such as radius or chamferRadius (which is kind of like the cornerRadius of a 2D view's CALayer). Check Apple’s documentation if you are interested in which geometries are available in SceneKit. Thank you for reading, and leave a comment/clap if you liked it! Cheers.

An intro to ARKit framework was originally published in Supercharge's Digital Product Development Guide on Medium.