Creation date: 2020-06-22
From time to time I hear that my posts are a little bit too deep and complicated for beginners. I totally agree: especially in my newest series on "How to build a constraint solver from scratch?" which is probably a never ending series. I do like to blog about it and will continue but this series is different.
Julia is a relatively young language and not too many people are using it. It thrives in scientific computing but I believe that it can be used for general computing (probably where start up time is not that relevant) i.e. I do use it for creating this blog with Franklin which I blogged a bit about as well here.
This series tries to explain some of the core concepts of Julia and maybe some packages to beginners. I, the explainer of this stuff here, am by no means an expert in any of these. If you know this blog then you might know that I'm trying to explain stuff directly after I've learned them to hopefully be able to communicate better than some people who do this their entire life and are experts in it. Blogs from experts are sometimes hard to follow for me, so maybe also for you. I'll let them proof read my blog post just to be sure that there is nothing wrong with what I explain ;)
Before we get into the content: Are you interested in getting updates on this series as well as a short email about new posts on opensourc.es ?
Free Patreon newsletterIn every programming language there are functions
which have the purpose of providing structure and reusability of code. The functions have a name
and some arguments
. They are defined like
def add(x, y):
return x + y
In Python this can be called with add(2,3)
which gives the expected 5
but also add("2","3")
gives '23'
which might make sense or not depending on who you ask :D
Now how about add("2", 3)
?
This results in an error TypeError: can only concatenate str (not "int") to str
which again can make sense. Well I would say it does but Javascript does something different:
function add(x,y) {
return x+y
}
will produce "23"
for that.
Then there are of course languages like C++
where this all is not that simple because we need to introduce types:
int add(int x, int y) {
return x+y;
}
There it is clear for everyone that it takes two int
and returns one int
. Because it is a compiled language we would get an error directly when trying to call it with add("2", 3)
.
Now what is the point? We can see different concepts here with having static types as in C++ and dynamic typing in JS and Python. They both have pros and cons which is probably obvious. One is easier to reason about and the others are maybe easier to hack around with.
Let's go into dispatching on those three languages:
In Python it is: Ah okay the user wants to call add
:
We have the add
function
Lets call it with the arguments.
Good that works as it expects two arguments and we have two.
Then we might get to the point where the function doesn't work i.e add("2", 3)
and throw an error or it works out and the answer is returned.
It is basically the same in JS. There are possibility some "extensions" like Typescript where things might happen differently. I haven't programmed with any of those languages lately so keep that in mind.
In C++
more things are going on as each variable has a static type and one can check directly whether this fits or not. This means as we can later see that there can be more functions with the same name. I'll explain the difference between function overloading and multiple dispatch in that part ;)
In all of these languages we can have classes such that we might have a class Manufacturer
and we can define add(self, thing)
or something like that and can call the function with manufacturer.add(thing)
. This can be kind of seen as single dispatch. Dispatch is more or less the process of deciding which function to call. Here it depends on the type of manufacturer
. Is it a Manufacturer
or a Box
? For a Box
we might have defined a class Box
and add(self, box)
inside of it.
For people coding in Julia for longer this might sound like a weird concept. Actually writing about it, I am thinking: How would I do some things, where I use multiple dispatch all the time (like in the ConstraintSolver) in one of those languages?
Before I explain multiple dispatch I want to note:
Yes there are ways in Python for single dispatching like @singledispatch but given that the main posts I found when searching are 3-4 years old I doubt that a lot are using it :D
And it is still single dispatch.
Now that there is that out of our way let us have a look at one example in Julia:
add(x, y) = x+y
just to show a neat little way of defining one-line functions... (or as they are called in Julia: Methods)
The interesting thing is when you type this in the Julia REPL (Read-eval-print loop) you get:
julia> add(x, y) = x+y
add (generic function with 1 method)
This means that we have now one function with currently a single method attached to it.
Calling that function works basically like in Python (from the user perspective for now). It doesn't work for strings though.
+
for strings: Julia is a mathematical language and +
is commutative whereas concatenating strings is not. So "2"+"3"
is not "3"+"2"
. Therefore Julia decided to use *
instead. Which is commutative for numbers but not matrices for example.Okay where did I interrupt myself? ... Ah yeah so we have an add
function with one method now in the REPL. That looks like we might be able to add a new one, right?
+
function with a lot of different implementations: called methods.julia> add(x,y) = 2x+y
add (generic function with 1 method)
okay that did not work because it still allows all types of inputs (and throws an error later when it doesn't work) because the compiler had no way to decide which functionmethod to call.
Should it guess? No, it just overwrites the old method.
A small side step again: Let's check what happens when we call add("2", "3")
julia> add("a", "b")
ERROR: MethodError: no method matching *(::Int64, ::String)
Closest candidates are:
*(::Any, ::Any, ::Any, ::Any...) at operators.jl:529
*(::Missing, ::AbstractString) at missing.jl:174
*(::T, ::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} at int.jl:54
...
Stacktrace:
[1] add(::String, ::String) at ./REPL[3]:1
[2] top-level scope at REPL[4]:1
That error occurs when calling 2x
where it figured that 2
is an integer and x = "2"
is a string. It gives us information of what kind of types it can multiply.
Let's pick one:
*(::T, ::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} at int.jl:54
This tells us that we can multiply two numbers of those types when they are the same. So UInt8
with UInt8
but I come to that syntax later.
You might wonder how many of those *
functionsmethods there are: 357
is the answer which you get when typing
julia> methods(*)
...
which gives you a long list with all kind of weird types where it sometimes spreads over several lines. I mean what is this? :D
[345] *(A::LinearAlgebra.LQ{TA,S} where S<:AbstractArray{TA,2}, B::Union{DenseArray{TB,1}, DenseArray{TB,2}, Base.ReinterpretArray{TB,1,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray}, Base.ReinterpretArray{TB,2,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray}, Base.ReshapedArray{TB,1,A,MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N} where N} where A<:Union{Base.ReinterpretArray{T,N,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray}, Base.ReshapedArray{TB,2,A,MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N} where N} where A<:Union{Base.ReinterpretArray{T,N,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray}, SubArray{TB,1,A,I,L} where L where I<:Tuple{Vararg{Union{Int64, AbstractRange{Int64}, Base.AbstractCartesianIndex},N} where N} where A<:Union{Base.ReinterpretArray{T,N,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, Base.ReshapedArray{T,N,A,MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N} where N} where A<:Union{Base.ReinterpretArray{T,N,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, DenseArray}, SubArray{TB,2,A,I,L} where L where I<:Tuple{Vararg{Union{Int64, AbstractRange{Int64}, Base.AbstractCartesianIndex},N} where N} where A<:Union{Base.ReinterpretArray{T,N,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, Base.ReshapedArray{T,N,A,MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64},N} where N} where A<:Union{Base.ReinterpretArray{T,N,S,A} where S where A<:Union{SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, SubArray{T,N,A,I,true} where I<:Union{Tuple{Vararg{Real,N} where N}, Tuple{AbstractUnitRange,Vararg{Any,N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, DenseArray}}) where {TA, TB} in LinearAlgebra at /home/ole/packages/julia-1.4.1/share/julia/stdlib/v1.4/LinearAlgebra/src/lq.jl:180
Do you see that scrollbar? Amazing!
Apropos methods
I wanted to mention a trick to more people anyway:
You might wanna have a look at that specific method definition. At least I'm interested to see that madness. At the end you get
/home/ole/packages/julia-1.4.1/share/julia/stdlib/v1.4/LinearAlgebra/src/lq.jl:180
which in my REPL is clickable and you get:
function *(A::LQ{TA}, B::StridedVecOrMat{TB}) where {TA,TB}
TAB = promote_type(TA, TB)
lmul!(Factorization{TAB}(A), copy_oftype(B, TAB))
end
which is kinda boring to be honest but you could also use:
julia> methods(*)
...
see that list and get the number. Here 345
. Type that number into the REPL and use CTRL+Q
and then it gets opened for you. Maybe in vim
or nano
or something (depending on your settings).
If you want to see it in your editor you need to change ENV["JULIA_EDITOR"]
in your ~/.julia/config/startup.jl
file. I use
ENV["JULIA_EDITOR"] = "code"
Damn let's talk about multiple dispatch again. This is basically multiple dispatch 😄 You have 357 methods for that function which are all called *
and they are just there without having any classes around them. It is still possible to decide which method to call and that is the important part.
In single dispatch languages it is only possible to have a different type for the first argument and then the compiler checks which function/method fits the first argument. In Julia as the name "multiple dispatch" suggests: it looks at all arguments.
Let's create some other add
methods shall we?
function add(x::String, y::String)
x < y ? "$x$y" : "$y$x"
end
function add(x, y::String)
return add(y, x)
end
add(x::String,y) = "$x$y"
at the end we get:
add (generic function with 4 methods)
Before we continue let's shortly mention that:
function add(x::String, y::String)
x < y ? "$x$y" : "$y$x"
end
uses an inline if
with the same syntax as some other languages with ?
and :
. It uses $
for variable interpolation and in Julia the last called expression is returned so we don't need return
in this case.
and:
julia> methods(add)
# 4 methods for generic function "add":
[1] add(x::String, y::String) in Main at REPL[9]:2
[2] add(x, y::String) in Main at REPL[10]:2
[3] add(x::String, y) in Main at REPL[13]:1
[4] add(x, y) in Main at REPL[3]:1
I decided to make add
commutative here even though it is called add
and not +
but anyway... While getting over that lets call some of those methods:
julia> add(2,3)
7
That still "works" even though a user would maybe like to get the result 5
instead 😄
julia> add(2, "abc")
"abc2"
works as well as I defined it in such a way that the string comes in front of the else
. Therefore it produces the same as add("abc", 2)
.
How to make things like add("Hello ", "World")
be commutative? Well we can compare the two strings and check which is lexicographically smaller.
julia> add("Hello ", "World")
"Hello World"
julia> add("World", "Hello ")
"Hello World"
Great that worked!
But how? Well It was just multiple dispatching but the major languages don't support that. It checked which of those 4 methods fit and which one is the most specialized.
If you have a look at the input "Hello ", "World"
which are two strings you actually see that all of those four methods could be called:
[1] add(x::String, y::String) in Main at REPL[9]:2
[2] add(x, y::String) in Main at REPL[10]:2
[3] add(x::String, y) in Main at REPL[13]:1
[4] add(x, y) in Main at REPL[3]:1
but [1]
is the most specialized version. [4]
is basically the fallback one. (There is no requirement that such a fallback exists)
In single dispatch languages only the first argument would be considered such that [1]
& [3]
would be ambiguous.
If you're curious you might wanna check which method is called for "Hello" < "World"
.
In Julia you can do this with a macro called @which
. (Macros are a topic for another post and I'm really not experienced with those yet)
julia> @which "def" < "abc"
<(x, y) in Base at operators.jl:268
I found it actually quite interesting that the generic method is called here. Maybe we want to see the source code.
A problem here is that our CTRL+Q
trick doesn't work anymore to check get to the source because there is no [number]
and it's a different macro anyway and we can't click on operators.jl
as not the full path is given. Maybe another trick?
julia> @edit "def" < "abc"
brings us to <(x, y) = isless(x, y)
which only brings us a tiny bit more to the truth 😄
julia> @edit isless("def", "abc")
gives us
isless(a::AbstractString, b::AbstractString) = cmp(a, b) < 0
where we have dispatched to AbstractString
now which is probably also part for another topic of what is an AbstractString
and what a String
.
One step deeper we get:
function cmp(a::String, b::String)
al, bl = sizeof(a), sizeof(b)
c = _memcmp(a, b, min(al,bl))
return c < 0 ? -1 : c > 0 ? +1 : cmp(al,bl)
end
which isn't that easy to read anymore but the idea behind all this was:
Being able to see the source
Seeing multiple dispatch in action
Okay we maybe should talk about that as well. I have to admit that I'm in general not a huge fan of OOP and silly examples taught in university with weird classes that nobody ever uses but here you go: My own stupid example of such type!
It is highly inspired by the talk of Stefan Karpinski at JuliaCon 2019:
(Okay it's basically the same with some different animals and a different function name). Nevertheless it's in a blog format here so you can easier copy and paste code.
I'm using a race track!
Let's start with the Julia example:
abstract type Animal end
struct Lizard <: Animal name :: String end
struct Rabbit <: Animal name :: String end
race(l::Lizard, r::Rabbit) = "$(l.name) wins in wall climbing."
race(r::Rabbit, l::Lizard) = "$(r.name) wins in a normal race."
race(a::T, b::T) where T <: Animal = "$(a.name) and $(b.name) run forever."
function meet(a::Animal, b::Animal)
println("$(a.name) meets $(b.name) and they race!")
println("Outcome: $(race(a,b))")
end
bayi = Lizard("Bayi")
sally = Rabbit("Sally")
meet(bayi, sally)
meet(sally, bayi)
meet(sally, sally)
before we come to what it outputs, which should be what we expect, let me just mention that lizards are fast for their size. (Not that this is important for this example 😄 )
Anyway what about a C++ example?
#include <iostream>
#include <string>
using namespace std;
class Animal {
public:
string name;
};
string race(Animal a, Animal b) { return a.name + " and " + b.name + " run until one wins"; }
void meet(Animal a, Animal b) {
cout << a.name << " meets " << b.name << endl;
cout << "Outcome: " << race(a, b) << endl;
}
class Lizard : public Animal {};
class Rabbit : public Animal {};
string race(Lizard l, Rabbit r) { return l.name + " wins in wall climbing"; }
string race(Rabbit r, Lizard l) { return r.name + " wins in a normal race"; }
int main() {
Lizard bayi; bayi.name = "Bayi";
Rabbit sally; sally.name = "Sally";
meet(bayi, sally);
meet(sally, bayi);
meet(sally, sally);
return 0;
}
Here
string race(Animal a, Animal b) { return a.name + " and " + b.name + " run until one wins"; }
is basically a fallback and not as in the julia case where both animals must be of the same type.
This doesn't matter because of the output in C++:
Bayi meets Sally
Outcome: Bayi and Sally run until one wins
Sally meets Bayi
Outcome: Sally and Bayi run until one wins
Sally meets Sally
Outcome: Sally and Sally run until one wins
and they are still running....
Wait a second what happens if we do:
race(sally, bayi)
that works Sally wins in a normal race
.
The problem of the meet
example is that it is function overloading and not multiple dispatch. Difference: Function overloading using the static type and not the actual type. Basically meet
will always call the function
string race(Animal a, Animal b) { return a.name + " and " + b.name + " run until one wins"; }
because all it uses is the static type of of a
and b
in meet
where it only knows it is an Animal
.
Multiple dispatch uses the actual type of a
and b
such that Julia
produces:
Bayi meets Sally and they race!
Outcome: Bayi wins in wall climbing.
Sally meets Bayi and they race!
Outcome: Sally wins in a normal race.
Sally meets Sally and they race!
Outcome: Sally and Sally run forever.
here I should probably mention the definition of the method which gets called in the last step:
race(a::T, b::T) where T <: Animal = "$(a.name) and $(b.name) run forever."
Here T
is a subtype of Animal
and by using T
both times we define that they have to be the same type. This means when defining a struct Cat
and call race(::Cat, ::Cat)
we would get the same result but race(::Cat, ::Lizard)
would not work as they are different subtypes of Animal
.
On to the next part:
This is quite subjective but a general feeling of me is that a language doesn't need it if it has multiple dispatch.
OOP gives structure to code and sometimes it is used to overcome the limitations of dispatching in those languages by introducing at least single dispatch. I often struggled to extend code in OOP because sometimes you want to use a class but somehow need an extra bit there, which has to be inside a class, that you don't have direct access to. It felt a bit restrictive for me and others will say it's sometimes good when you're not allowed to do everything. Same people say "2"+3
shouldn't work in Javascript and yeah okay I might agree on the last point but sometimes it's damn awesome to see if something just works without converting types :D
Anyway I don't want to say Julia is better in any way or "DON'T USE C++". They just have different concepts.
This post hopefully gave you an idea on what Julia is doing different and how that can be useful. I encourage to try it out a bit for yourself and definitely watch that presentation by Stefan Karpinski as he also shows how this makes talking between packages easier. I don't want to repeat that here or give other examples just yet. In future posts I hopefully can show some real world examples in my code when doing something simpler than my ConstraintSolver where I use it all the time.
I would love to see you again on the blog and please share your thoughts in the comments!
A couple more posts are done 😉
Post #2: REPL & Revise
Check all via the sidebar!
Other neat little posts about Julia:
If you're a beginner you might be interested in: Getting mentored
Thanks for reading and special thanks to my six patrons!
List of patrons paying more than 4$ per month:
Site Wang
Gurvesh Sanghera
Szymon Bęczkowski
Currently I get 19$ per Month using Patreon and PayPal.
For a donation of a single dollar per month you get early access to the posts. Try it out at the start of a month and if you don't enjoy it just cancel your subscription before pay day (end of month).
I'll keep you updated on Twitter OpenSourcES as well as my more personal one: Twitter Wikunia_de