Wednesday, April 16, 2014

5 Faces of Dependency Injection in Clojure

Dilemma:

How can code, in a functional way, access resources like databases and other external dependencies?  Most non-trivial applications have one or more external resources that need to be acquired before the application can work.  Things like databases and web services fall into this category.  The problem is that they need to be shared across much of the code base, yet they are not truly static.  The database password will need to change.  Different test and production environments will need to point to different servers.

How can the code be written so that the database information can be different each time the code is run?  If this were Java, the common answer would be Dependency Injection.  Using the Spring Framework, dependencies are injected into the objects at run time.  This allows the configuration to happen outside the code while still letting the objects have access to it.

Solution:

Thanks go to Stuart Sierra for his simple and elegant context idea, as demonstrated in his Clojure in the Large presentation and in his Component library.  Using the techniques he describes, it is possible to separate the configuration of the resources from the code that needs them.  This addresses the first part of the problem.  The second part remains: how to give the code access to the resources.

This leads to the title: Dependency Injection in Clojure.  For object-oriented languages like Java, Dependency Injection is a powerful and widely used technique.  Of course, Dependency Injection as practiced there is not meaningful in a functional environment.  It requires objects, and specifically mutable objects, to really work: the objects are built and then mutated by having their dependencies injected.

However, the same problem exists in Clojure.  Rather than giving objects access to the context, the Clojure problem is giving functions access to the context.  For this article, Dependency Injection will have a looser definition meaning "giving functions access to the context".

Having thought about it and discussed it with Ben and Levi at Lambda Lounge Utah, I see at least 5 ways of doing Dependency Injection in Clojure.  That is, 5 ways of giving functions access to a shared context.  Rather than presenting them in order of obviousness, I will present them starting with the most naive.

Globally Shared

This is often the first way developers think to share data across an application: simply throw it in a def in a namespace and allow any function that needs it to reference it from there.  
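
A minimal sketch of the globally shared approach follows. The context map, its keys, and the db-url function are hypothetical names chosen for illustration; they do not come from any library.

```clojure
;; Hypothetical context map; the keys and values are placeholders.
(def context
  {:db {:host "localhost" :port 5432}})

(defn db-url
  "Builds a connection URL by reaching out to the global var."
  []
  (let [{:keys [host port]} (:db context)]
    (str "jdbc:postgresql://" host ":" port "/app")))

(db-url) ;; => "jdbc:postgresql://localhost:5432/app"
```

Nothing in the signature of db-url hints that it depends on the context, which is part of what makes this style hard to test.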


It has the advantage of being simple to implement.  The disadvantages are numerous, and Dependency Injection was developed in large part to overcome the shortcomings of globally shared data.  Among other things, putting the context in a globally shared data structure makes testing and moving across environments much more cumbersome.

Binding

The next thing people will often try is using the Clojure binding form
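
As a rough sketch, assuming a hypothetical dynamic var named *context* and a hypothetical db-host function, the binding approach might look like this:

```clojure
;; Hypothetical dynamic var holding the context; nil until bound.
(def ^:dynamic *context* nil)

(defn db-host
  "Reads the host from whatever context is currently bound."
  []
  (get-in *context* [:db :host]))

(defn run-with-context
  "Binds the context for the duration of the call to db-host."
  [ctx]
  (binding [*context* ctx]
    (db-host)))

(run-with-context {:db {:host "test-db"}}) ;; => "test-db"
```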

This seems to overcome the globally shared problem, since each caller can now give a function its own binding of the context.  However, the article "Be Mindful of Clojure's binding" is an excellent description of some of the perils of using the binding form.  It specifically mentions crossing thread boundaries and problems involving lazy sequences.  While your context most likely won't be lazy, it will almost certainly cross thread boundaries, which makes the use of binding potentially troublesome.
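
The lazy sequence peril can be reproduced in a few lines. This is only a sketch; *context* and db-host are hypothetical names, and the example is not taken from the article mentioned above.

```clojure
;; Hypothetical dynamic var holding the context; nil until bound.
(def ^:dynamic *context* nil)

(defn db-host []
  (get-in *context* [:db :host]))

;; map is lazy, so db-host does not run until the sequence is consumed,
;; which happens after the binding form has already exited.
(def lazy-hosts
  (binding [*context* {:db {:host "test-db"}}]
    (map (fn [_] (db-host)) [:a :b])))

;; doall forces the sequence while the binding is still in effect.
(def eager-hosts
  (binding [*context* {:db {:host "test-db"}}]
    (doall (map (fn [_] (db-host)) [:a :b]))))

(first lazy-hosts)  ;; => nil
(first eager-hosts) ;; => "test-db"
```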

Both of the previous examples resolve the context through a var at run time rather than through lexical scope, which is the underlying trouble.  It would be preferable to use lexical scoping to access the context.  The remaining 3 examples will use lexical scoping.

Function Argument

The first example of lexical scoping is simply passing the context into each function that wants it as an argument.
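
A minimal sketch, using a hypothetical find-user function and a plain map as the context:

```clojure
(defn find-user
  "Looks up a user in the context passed in by the caller."
  [context id]
  (get-in context [:db :users id]))

;; A test can hand in whatever context it likes.
(find-user {:db {:users {1 {:name "Ada"}}}} 1) ;; => {:name "Ada"}
```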

There is a lot to be said for this method, and it will usually be the first choice for dependency injection in Clojure.  In the majority of cases, this method serves nicely and makes for clean code.  However, there is at least one drawback that a simple example does not reveal.  The context ends up getting passed to functions that do not use it directly, but just pass it on to the functions they in turn call.  In a large project, there may be hundreds of functions all passing the context around just so it can make its way to the bottom layer of code, which interacts with the resources.  It might be my OO upbringing, but this feels like exposing implementation details, just as a leaky abstraction does in Java.
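
The pass-through drawback shows up as soon as there is more than one layer. In this sketch (all function names are hypothetical), only find-user touches the context, yet every function above it has to accept and forward it:

```clojure
(defn find-user
  "Bottom layer: the only function that actually uses the context."
  [context id]
  (get-in context [:db :users id]))

(defn greeting
  "Middle layer: takes the context only to pass it on."
  [context id]
  (str "Hello, " (:name (find-user context id))))

(defn render-page
  "Top layer: also takes the context only to pass it on."
  [context id]
  (str "<h1>" (greeting context id) "</h1>"))

(render-page {:db {:users {1 {:name "Ada"}}}} 1)
;; => "<h1>Hello, Ada</h1>"
```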

Closing over the Context

Another method would be closing over the context using let and letfn.
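
One way this could look, as a sketch with hypothetical names (make-app, foo, app) and a plain map as the context:

```clojure
(defn make-app
  "Takes the context once and returns the app function, which closes over it."
  [context]
  (letfn [(foo [id]
            ;; Private helper: sees the context lexically.
            (get-in context [:users id]))
          (app [id]
            (str "Hello, " (foo id)))]
    app))

(def app (make-app {:users {1 "Ada"}}))

(app 1) ;; => "Hello, Ada"
```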

In the example the context is passed into a function which then closes over it.  The foo function is no longer visible outside the closure and can only be called by the app function.  While one could write the entire application in such a fashion, the result is not very clear.  Where this approach shines is in the case of callbacks, as in Swing.  The function can close over not only the context but the components of the application as well, which makes it quite simple to create callbacks that reference both.
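
A callback in this style might be sketched as follows. The listener is a real java.awt.event.ActionListener, but make-save-listener is a hypothetical name and the status atom is only a stand-in for a real component such as a JLabel, so that the sketch runs without opening a window.

```clojure
(import '(java.awt.event ActionListener))

(defn make-save-listener
  "Returns a listener that closes over both the context and a status holder.
  `status` is an atom standing in for a real Swing component."
  [context status]
  (reify ActionListener
    (actionPerformed [_ _event]
      (reset! status (str "Saved to " (get-in context [:db :host]))))))

(def status (atom ""))
(def listener (make-save-listener {:db {:host "test-db"}} status))

;; A button would call this on a click; here it is invoked by hand.
(.actionPerformed ^ActionListener listener nil)

@status ;; => "Saved to test-db"
```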

Reader Monad


Monads often get a bad rap, and for good reason.  They are a bit hard to grasp.  The examples and tutorials often describe what they are but not what to do with them.  Fortunately, I found a good example in the presentation "Monads in Clojure" by Leonardo Borges.

Let me describe what the reader monad will do, but not how it does it (as I do not really care how it works).  Looking at the Closing over the Context example above, the function takes a context, creates the closure, and returns a function that will execute the functions with the symbols bound.  But what if that were turned inside out?  Rather than closing over the context, close over whatever other data is needed at creation and return a function that expects the context as an argument.


This requires an explanation.  What does (domonad reader-m [val func] body) do?  Quite simply, it returns a function that accepts a single argument, the context.  When that function is executed with the context as an argument, func is executed and the result stored as val.  The body is then executed and returned.

The func function has to return a monadic value as well (that is, another function of the context), so that it can be called with the same context that was passed in to the calling function.

In foo there is a function called asks.  This function is a helper for the reader monad that returns a value from the context.
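
To make the mechanics concrete, here is a hand-rolled sketch in plain Clojure. It does not use a monad library or the domonad macro; reader-return and reader-bind are hypothetical helper names, and this version of asks is an assumption about how such a helper could be written. In these terms, (domonad reader-m [val func] body) corresponds roughly to (reader-bind func (fn [val] (reader-return body))).

```clojure
;; In this sketch a "reader" is simply a function of the context.

(defn reader-return
  "Wraps a plain value in a reader that ignores the context."
  [v]
  (fn [_context] v))

(defn reader-bind
  "Runs reader r with the context, feeds its result to f, and runs
  the reader that f returns with the same context."
  [r f]
  (fn [context]
    ((f (r context)) context)))

(defn asks
  "Reader that applies f to the context, e.g. (asks :users)."
  [f]
  (fn [context] (f context)))

(defn foo
  "Bottom layer: returns a reader that looks up a user in the context."
  [id]
  (reader-bind (asks :users)
               (fn [users] (reader-return (get users id)))))

(defn app
  "Top layer: never mentions the context, yet foo still receives it."
  [id]
  (reader-bind (foo id)
               (fn [user] (reader-return (str "Hello, " user)))))

;; Nothing runs until the reader is called with a context.
((app 1) {:users {1 "Ada"}}) ;; => "Hello, Ada"
```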


What happens when the app function is called?  It returns a function that behaves like this:
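
As an illustrative sketch only, the returned function behaves roughly like the plain functions below. The names app* and foo* are just labels for the generated functions, and the :greeting key is a hypothetical piece of context.

```clojure
(defn foo*
  "Stands in for the function generated from foo: reads from the context."
  [context]
  (:greeting context))

(defn app*
  "Stands in for the function generated from app: runs foo* with the same
  context, binds the result to val, then evaluates the body."
  [context]
  (let [val (foo* context)]
    (str val ", world")))

(app* {:greeting "Hello"}) ;; => "Hello, world"
```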

Of course, it does not really use defn; it uses macros to generate the required functions and returns the top-level one.  When app* is executed, it calls foo*, passing in the context.  The nice thing is that there can be any number of function calls between app and foo that know and care nothing about the context.

Of course, the (domonad reader-m ...) form is hard to read and does not convey intent very well.  To help make it clearer, I created some simple macros to wrap it.

The wrapper is at topoged/context.clj if you care to look.  The advantage is that the context can now be passed using lexical scoping without having to be mentioned in the argument lists of functions that do not need it.  The downside is that you have to know which functions return the monad and which don't.  Those that do have to go in the [val func] bindings, while those that don't have to go in the body.  This might end up being a pain as well.  While I think this option shows promise, that problem will need to be overcome to make it truly useful.

The 5 methods of dependency injection, or getting data into functions, are: globally shared, binding, function arguments, closing over the context, and the reader monad.  Each has a place in the overall architecture of a Clojure app and should be in every developer's toolbox.