Julia Basics: Multiple Dispatch

Creation date: 2020-06-22

From time to time I hear that my posts are a little bit too deep and complicated for beginners. I totally agree: especially in my newest series on "How to build a constraint solver from scratch?" which is probably a never ending series. I do like to blog about it and will continue but this series is different.

Julia is a relatively young language and not too many people are using it. It thrives in scientific computing but I believe that it can be used for general computing (probably where start up time is not that relevant) i.e. I do use it for creating this blog with Franklin which I blogged a bit about as well here.

This series tries to explain some of the core concepts of Julia and maybe some packages to beginners. I, the explainer of this stuff here, am by no means an expert in any of these. If you know this blog then you might know that I'm trying to explain stuff directly after I've learned them to hopefully be able to communicate better than some people who do this their entire life and are experts in it. Blogs from experts are sometimes hard to follow for me, so maybe also for you. I'll let them proof read my blog post just to be sure that there is nothing wrong with what I explain ;)

Note

All the posts in this series can be reached via the side bar.

What is dispatch?
How does dispatch work in Julia?
Wait what about function overloading in C++
Why Julia doesn't use OOP?

Note

Before we get into the content: Are you interested in getting updates on this series as well as a short email about new posts on opensourc.es ?

Free Patreon newsletter

What is dispatch?

In every programming language there are functions which have the purpose of providing structure and reusability of code. The functions have a name and some arguments. They are defined like

def add(x, y):
    return x + y

In Python this can be called with add(2,3) which gives the expected 5 but also add("2","3") gives '23' which might make sense or not depending on who you ask :D

Now how about add("2", 3) ?

This results in an error TypeError: can only concatenate str (not "int") to str which again can make sense. Well I would say it does but Javascript does something different:

function add(x,y) {
    return x+y
}

will produce "23" for that.

Then there are of course languages like C++ where this all is not that simple because we need to introduce types:

int add(int x, int y) {
  return x+y;
}

There it is clear for everyone that it takes two int and returns one int. Because it is a compiled language we would get an error directly when trying to call it with add("2", 3).

Now what is the point? We can see different concepts here with having static types as in C++ and dynamic typing in JS and Python. They both have pros and cons which is probably obvious. One is easier to reason about and the others are maybe easier to hack around with.

Let's go into dispatching on those three languages:

In Python it is: Ah okay the user wants to call add:

We have the add function
Lets call it with the arguments.
Good that works as it expects two arguments and we have two.

Then we might get to the point where the function doesn't work i.e add("2", 3) and throw an error or it works out and the answer is returned.

It is basically the same in JS. There are possibility some "extensions" like Typescript where things might happen differently. I haven't programmed with any of those languages lately so keep that in mind.

In C++ more things are going on as each variable has a static type and one can check directly whether this fits or not. This means as we can later see that there can be more functions with the same name. I'll explain the difference between function overloading and multiple dispatch in that part ;)

In all of these languages we can have classes such that we might have a class Manufacturer and we can define add(self, thing) or something like that and can call the function with manufacturer.add(thing). This can be kind of seen as single dispatch. Dispatch is more or less the process of deciding which function to call. Here it depends on the type of manufacturer. Is it a Manufacturer or a Box? For a Box we might have defined a class Box and add(self, box) inside of it.

Note

These examples are more from the Python world but hopefully convey the point.

For people coding in Julia for longer this might sound like a weird concept. Actually writing about it, I am thinking: How would I do some things, where I use multiple dispatch all the time (like in the ConstraintSolver) in one of those languages?

Before I explain multiple dispatch I want to note:

Yes there are ways in Python for single dispatching like @singledispatch but given that the main posts I found when searching are 3-4 years old I doubt that a lot are using it :D

And it is still single dispatch.

How does dispatch work in Julia?

Now that there is that out of our way let us have a look at one example in Julia:

add(x, y) = x+y

just to show a neat little way of defining one-line functions... (or as they are called in Julia: Methods)

The interesting thing is when you type this in the Julia REPL (Read-eval-print loop) you get:

julia> add(x, y) = x+y
add (generic function with 1 method)

This means that we have now one function with currently a single method attached to it.

Calling that function works basically like in Python (from the user perspective for now). It doesn't work for strings though.

Note

Julia and + for strings: Julia is a mathematical language and + is commutative whereas concatenating strings is not. So "2"+"3" is not "3"+"2". Therefore Julia decided to use * instead. Which is commutative for numbers but not matrices for example.

Okay where did I interrupt myself? ... Ah yeah so we have an add function with one method now in the REPL. That looks like we might be able to add a new one, right?

Note

As correctly pointed out on Reddit: I use mostly the word function. In Julia there is actually a difference between functions and methods. There is one + function with a lot of different implementations: called methods.

julia> add(x,y) = 2x+y
add (generic function with 1 method)

okay that did not work because it still allows all types of inputs (and throws an error later when it doesn't work) because the compiler had no way to decide which ~~function~~method to call.

Should it guess? No, it just overwrites the old method.

A small side step again: Let's check what happens when we call add("2", "3")

julia> add("a", "b")
ERROR: MethodError: no method matching *(::Int64, ::String)
Closest candidates are:
  *(::Any, ::Any, ::Any, ::Any...) at operators.jl:529
  *(::Missing, ::AbstractString) at missing.jl:174
  *(::T, ::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} at int.jl:54
  ...
Stacktrace:
 [1] add(::String, ::String) at ./REPL[3]:1
 [2] top-level scope at REPL[4]:1

That error occurs when calling 2x where it figured that 2 is an integer and x = "2" is a string. It gives us information of what kind of types it can multiply.

Let's pick one:

*(::T, ::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} at int.jl:54

This tells us that we can multiply two numbers of those types when they are the same. So UInt8 with UInt8 but I come to that syntax later.

You might wonder how many of those * ~~functions~~methods there are: 357 is the answer which you get when typing

julia> methods(*)
...

which gives you a long list with all kind of weird types where it sometimes spreads over several lines. I mean what is this? :D

[345] *(A::LinearAlgebra.LQ{TA,S} where S<:AbstractArray{TA,2}, B::Union{DenseArray{TB,1}, DenseArray{TB,2}, Base.ReinterpretArray{TB,1,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray}, Base.ReinterpretArray{TB,2,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray}, Base.ReshapedArray{TB,1,A,MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N} where N} where A<:Union{Base.ReinterpretArray{T,N,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray}, Base.ReshapedArray{TB,2,A,MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N} where N} where A<:Union{Base.ReinterpretArray{T,N,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray}, SubArray{TB,1,A,I,L} where L where I<:Tuple{Vararg{Union{Int64, AbstractRange{Int64}, Base.AbstractCartesianIndex},N} where N} where A<:Union{Base.ReinterpretArray{T,N,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, Base.ReshapedArray{T,N,A,MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N} where N} where A<:Union{Base.ReinterpretArray{T,N,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, DenseArray}, SubArray{TB,2,A,I,L} where L where I<:Tuple{Vararg{Union{Int64, AbstractRange{Int64}, Base.AbstractCartesianIndex},N} where N} where A<:Union{Base.ReinterpretArray{T,N,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, Base.ReshapedArray{T,N,A,MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N} where N} where A<:Union{Base.ReinterpretArray{T,N,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, DenseArray}}) where {TA, TB} in LinearAlgebra at /home/ole/packages/julia-1.4.1/share/julia/stdlib/v1.4/LinearAlgebra/src/lq.jl:180

Do you see that scrollbar? Amazing!

Apropos methods I wanted to mention a trick to more people anyway:

You might wanna have a look at that specific method definition. At least I'm interested to see that madness. At the end you get

/home/ole/packages/julia-1.4.1/share/julia/stdlib/v1.4/LinearAlgebra/src/lq.jl:180

which in my REPL is clickable and you get:

function *(A::LQ{TA}, B::StridedVecOrMat{TB}) where {TA,TB}
    TAB = promote_type(TA, TB)
    lmul!(Factorization{TAB}(A), copy_oftype(B, TAB))
end

which is kinda boring to be honest but you could also use:

julia> methods(*)
...

see that list and get the number. Here 345. Type that number into the REPL and use CTRL+Q and then it gets opened for you. Maybe in vim or nano or something (depending on your settings).

If you want to see it in your editor you need to change ENV["JULIA_EDITOR"] in your ~/.julia/config/startup.jl file. I use

ENV["JULIA_EDITOR"] = "code"

Damn let's talk about multiple dispatch again. This is basically multiple dispatch 😄 You have 357 methods for that function which are all called * and they are just there without having any classes around them. It is still possible to decide which method to call and that is the important part.

In single dispatch languages it is only possible to have a different type for the first argument and then the compiler checks which function/method fits the first argument. In Julia as the name "multiple dispatch" suggests: it looks at all arguments.

Let's create some other add methods shall we?

function add(x::String, y::String)
    x < y ? "$x$y" : "$y$x"
end
function add(x, y::String)
    return add(y, x)
end
add(x::String,y) = "$x$y"

at the end we get:

add (generic function with 4 methods)

Before we continue let's shortly mention that:

function add(x::String, y::String)
    x < y ? "$x$y" : "$y$x"
end

uses an inline if with the same syntax as some other languages with ? and :. It uses $ for variable interpolation and in Julia the last called expression is returned so we don't need return in this case.

and:

julia> methods(add)
# 4 methods for generic function "add":
[1] add(x::String, y::String) in Main at REPL[9]:2
[2] add(x, y::String) in Main at REPL[10]:2
[3] add(x::String, y) in Main at REPL[13]:1
[4] add(x, y) in Main at REPL[3]:1

I decided to make add commutative here even though it is called add and not + but anyway... While getting over that lets call some of those methods:

julia> add(2,3)
7

That still "works" even though a user would maybe like to get the result 5 instead 😄

julia> add(2, "abc")
"abc2"

works as well as I defined it in such a way that the string comes in front of the else. Therefore it produces the same as add("abc", 2).

How to make things like add("Hello ", "World") be commutative? Well we can compare the two strings and check which is lexicographically smaller.

julia> add("Hello ", "World")
"Hello World"
julia> add("World", "Hello ")
"Hello World"

Great that worked!

But how? Well It was just multiple dispatching but the major languages don't support that. It checked which of those 4 methods fit and which one is the most specialized.

If you have a look at the input "Hello ", "World" which are two strings you actually see that all of those four methods could be called:

[1] add(x::String, y::String) in Main at REPL[9]:2
[2] add(x, y::String) in Main at REPL[10]:2
[3] add(x::String, y) in Main at REPL[13]:1
[4] add(x, y) in Main at REPL[3]:1

but [1] is the most specialized version. [4] is basically the fallback one. (There is no requirement that such a fallback exists)

In single dispatch languages only the first argument would be considered such that [1] & [3] would be ambiguous.

If you're curious you might wanna check which method is called for "Hello" < "World".

In Julia you can do this with a macro called @which. (Macros are a topic for another post and I'm really not experienced with those yet)

julia> @which "def" < "abc"
<(x, y) in Base at operators.jl:268

I found it actually quite interesting that the generic method is called here. Maybe we want to see the source code.

A problem here is that our CTRL+Q trick doesn't work anymore to check get to the source because there is no [number] and it's a different macro anyway and we can't click on operators.jl as not the full path is given. Maybe another trick?

julia> @edit "def" < "abc"

brings us to <(x, y) = isless(x, y) which only brings us a tiny bit more to the truth 😄

julia> @edit isless("def", "abc")

gives us

isless(a::AbstractString, b::AbstractString) = cmp(a, b) < 0

where we have dispatched to AbstractString now which is probably also part for another topic of what is an AbstractString and what a String.

One step deeper we get:

function cmp(a::String, b::String)
    al, bl = sizeof(a), sizeof(b)
    c = _memcmp(a, b, min(al,bl))
    return c < 0 ? -1 : c > 0 ? +1 : cmp(al,bl)
end

which isn't that easy to read anymore but the idea behind all this was:

Being able to see the source
Seeing multiple dispatch in action

Wait what about function overloading in C++

Okay we maybe should talk about that as well. I have to admit that I'm in general not a huge fan of OOP and silly examples taught in university with weird classes that nobody ever uses but here you go: My own stupid example of such type!

It is highly inspired by the talk of Stefan Karpinski at JuliaCon 2019:

(Okay it's basically the same with some different animals and a different function name). Nevertheless it's in a blog format here so you can easier copy and paste code.

I'm using a race track!

Let's start with the Julia example:

abstract type Animal end
struct Lizard <: Animal name :: String end
struct Rabbit <: Animal name :: String end

race(l::Lizard, r::Rabbit) = "$(l.name) wins in wall climbing."
race(r::Rabbit, l::Lizard) = "$(r.name) wins in a normal race."
race(a::T, b::T) where T <: Animal = "$(a.name) and $(b.name) run forever."

function meet(a::Animal, b::Animal) 
    println("$(a.name) meets $(b.name) and they race!")
    println("Outcome: $(race(a,b))")
end

bayi = Lizard("Bayi")
sally = Rabbit("Sally")
meet(bayi, sally)
meet(sally, bayi)
meet(sally, sally)

before we come to what it outputs, which should be what we expect, let me just mention that lizards are fast for their size. (Not that this is important for this example 😄 )

Anyway what about a C++ example?

#include <iostream>
#include <string>

using namespace std;

class Animal {
    public: 
 	string name;
};

string race(Animal a, Animal b) { return a.name + " and " + b.name + " run until one wins"; }

void meet(Animal a, Animal b) {
   cout << a.name << " meets " << b.name << endl;
   cout << "Outcome: " << race(a, b) << endl;
}

class Lizard : public Animal {};
class Rabbit : public Animal {};

string race(Lizard l, Rabbit r) { return l.name + " wins in wall climbing"; }
string race(Rabbit r, Lizard l) { return r.name + " wins in a normal race"; }

int main() {
   Lizard bayi; bayi.name = "Bayi";
   Rabbit sally; sally.name = "Sally";

   meet(bayi, sally);
   meet(sally, bayi);
   meet(sally, sally);

   return 0;
}

Here

string race(Animal a, Animal b) { return a.name + " and " + b.name + " run until one wins"; }

is basically a fallback and not as in the julia case where both animals must be of the same type.

This doesn't matter because of the output in C++:

Bayi meets Sally
Outcome: Bayi and Sally run until one wins
Sally meets Bayi
Outcome: Sally and Bayi run until one wins
Sally meets Sally
Outcome: Sally and Sally run until one wins

and they are still running....

Wait a second what happens if we do:

race(sally, bayi)

that works Sally wins in a normal race.

The problem of the meet example is that it is function overloading and not multiple dispatch. Difference: Function overloading using the static type and not the actual type. Basically meet will always call the function

string race(Animal a, Animal b) { return a.name + " and " + b.name + " run until one wins"; }

because all it uses is the static type of of a and b in meet where it only knows it is an Animal.

Multiple dispatch uses the actual type of a and b such that Julia produces:

Bayi meets Sally and they race!
Outcome: Bayi wins in wall climbing.
Sally meets Bayi and they race!
Outcome: Sally wins in a normal race.
Sally meets Sally and they race!
Outcome: Sally and Sally run forever.

here I should probably mention the definition of the method which gets called in the last step:

race(a::T, b::T) where T <: Animal = "$(a.name) and $(b.name) run forever."

Here T is a subtype of Animal and by using T both times we define that they have to be the same type. This means when defining a struct Cat and call race(::Cat, ::Cat) we would get the same result but race(::Cat, ::Lizard) would not work as they are different subtypes of Animal.

On to the next part:

Why Julia doesn't use OOP?

This is quite subjective but a general feeling of me is that a language doesn't need it if it has multiple dispatch.

OOP gives structure to code and sometimes it is used to overcome the limitations of dispatching in those languages by introducing at least single dispatch. I often struggled to extend code in OOP because sometimes you want to use a class but somehow need an extra bit there, which has to be inside a class, that you don't have direct access to. It felt a bit restrictive for me and others will say it's sometimes good when you're not allowed to do everything. Same people say "2"+3 shouldn't work in Javascript and yeah okay I might agree on the last point but sometimes it's damn awesome to see if something just works without converting types :D

Anyway I don't want to say Julia is better in any way or "DON'T USE C++". They just have different concepts.

This post hopefully gave you an idea on what Julia is doing different and how that can be useful. I encourage to try it out a bit for yourself and definitely watch that presentation by Stefan Karpinski as he also shows how this makes talking between packages easier. I don't want to repeat that here or give other examples just yet. In future posts I hopefully can show some real world examples in my code when doing something simpler than my ConstraintSolver where I use it all the time.

I would love to see you again on the blog and please share your thoughts in the comments!

A couple more posts are done 😉

Post #2: REPL & Revise

Check all via the sidebar!

OpenSourc.ES

Julia Basics: Multiple Dispatch

What is dispatch?

How does dispatch work in Julia?

Wait what about function overloading in C++

Why Julia doesn't use OOP?