Interview with Daniel Bachler

There are rumors that Elm 1.0 is coming soon, so today we decided to interview Daniel Bachler, an engineer with commercial experience in Elm and many other languages. Daniel is a true polymath with deep engineering expertise, so we recommend checking out his site and following him on Twitter if you're interested in Elm, F#, functional programming, or computer science in general.

Daniel Bachler

Software Engineer

Can you introduce yourself to our audience?

Sure! I'm 36 years old, I live in Berlin, and I work as a software engineer for Douglas Connect in Switzerland. Together with my wife I also run a photography studio, and during the summer we shoot a lot of weddings all over Europe and sometimes beyond.

At Douglas Connect we develop solutions in the field of toxicology with the vision of reducing animal testing and replacing it with carefully guided in-vitro tests and computer models. A lot of my day-to-day work is about improving data management so we can even start building machine learning models in the first place. We use a few different languages and technologies, chief among them F#, Elm and Python.

What are your professional milestones, so that we can better understand the context?

I got into programming when, at 14, I switched to a specialized school for IT that we have in Austria. We did not have a lot of theory, but we were taught quite a few languages on a practical level: first C, then C++ on MS-DOS, then Visual C++ on Windows, Cobol, PL/1, IBM 390 Assembler, Prolog and maybe a few more. These were mostly old languages even then, but it was good to get exposed to different concepts. After school I worked with C# (which had just come out) for a long time. Together with two colleagues I built a pretty sophisticated data analysis app from scratch, and that exposed all of us to a lot of different areas: from designing a comfortable UX to writing DSL parsers and high-performance multithreaded optimization code.

Towards the end of that period I got interested in approaches outside the object-oriented C# world that I knew well. I wanted to learn Haskell, since a friend had told me many fascinating and, frankly, also somewhat scary things about it. I tried it but found it hard to grasp (this was 2013 or so, and learning resources were more basic than they are today). Around that time I also stumbled across F# for Fun and Profit, and Scott Wlaschin's explanations of functional concepts in F# made a lot of things click for me that were then also very helpful in Haskell.

In 2015 I wanted to write a photo slideshow editor to sync photos to music, and that finally brought me to Elm. I had tried JavaScript with React and Redux, but I quickly found refactoring JavaScript to be too much of a pain, so I tried Elm, and that was the first time I really started writing non-trivial amounts of code in an ML family language.

In 2016 the CTO of Douglas Connect was also interested in Elm, found some of my forum and blog posts, and they ended up hiring me. We developed a few of our applications in Elm, and we're now using a lot of F# (in both the backend and the frontend). F# is in an interesting position today: in addition to the Windows-only .NET Framework, it runs on the cross-platform .NET Core, and it can be compiled to JavaScript with pretty good interop with existing JavaScript libraries and even semi-automatic type generation from TypeScript definitions (see the SAFE stack for a nice setup that combines these for a full-stack F# experience).

ML-like languages are on the rise today: F#, Elm, ReasonML... – how would you compare them?

I think they are all interesting languages with a lot of important similarities and shared features. Maybe the most important for me personally are algebraic datatypes – something that is sorely lacking in a lot of more mainstream languages. For example, think about how to model an operation status that carries progress information while running, a floating point result if the operation succeeds, and a string error message if it fails. A language with sum types can express this so beautifully (like here in Elm syntax):

type OperationStatus
    = Running Int Int
    | Success Float
    | Error String

All of the ML family languages I know also have pattern matching and exhaustiveness checking, and together these are a really powerful tool for writing code that delivers on the promise of "making invalid states unrepresentable" (there are a few great talks on this phrase, by the way).
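As a minimal sketch of what pattern matching with exhaustiveness checking looks like for the type above: the `describe` function and the reading of the two `Running` fields as "steps done" and "steps total" are assumptions made for illustration, not something stated in the interview.

```elm
module OperationStatus exposing (OperationStatus(..), describe)


type OperationStatus
    = Running Int Int
    | Success Float
    | Error String


-- Hypothetical helper: turn a status into a human-readable message.
-- Assumption: the two Ints of Running mean (steps done, steps total).
describe : OperationStatus -> String
describe status =
    case status of
        Running done total ->
            "Running: " ++ String.fromInt done ++ " of " ++ String.fromInt total

        Success value ->
            "Finished with result " ++ String.fromFloat value

        Error message ->
            "Failed: " ++ message
```

If one of the three branches is removed, or a fourth variant is added to the type without a matching branch, the Elm compiler refuses to compile the `case` expression and points at the missing pattern.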

On the other hand, if you think about how something like this is represented in languages that don't support sum types, you end up with one of two solutions. Either you create a record/class that can store all these pieces of information at the same time, but only some of them make sense at any given time (e.g. the error message and progress values are unused if the status enum indicates success). Or you resort to creating a class hierarchy for this, maybe with some abstract base class, and then things get messy: you end up pulling all kinds of concerns into those classes or testing on the concrete type, which kind of defeats the purpose of the class hierarchy and will easily break if you add new cases to your OperationStatus type hierarchy.
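The first of these two workarounds can be sketched in Elm itself, by deliberately avoiding payloads on the variants. All names here (`StatusKind`, `LooseStatus`, `contradictory`) are hypothetical and exist only to show the problem.

```elm
module LooseStatus exposing (LooseStatus, StatusKind(..), contradictory)


-- A plain enumeration without payloads, as in languages lacking sum types.
type StatusKind
    = IsRunning
    | IsSuccess
    | IsError


-- Every field is always present, whether or not it is meaningful.
type alias LooseStatus =
    { kind : StatusKind
    , done : Int
    , total : Int
    , result : Float
    , errorMessage : String
    }


-- This value type-checks even though it contradicts itself:
-- it claims success, is only partway through, and carries an error.
contradictory : LooseStatus
contradictory =
    { kind = IsSuccess
    , done = 3
    , total = 10
    , result = 0
    , errorMessage = "connection timed out"
    }
```

Nothing stops such a value from being constructed, so every consumer has to know by convention which fields to ignore for which `kind`. With the sum type, the same combination simply cannot be written down.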

As for the other similarities, there is of course the similar syntax, and they all have static typing, type inference, first-class functions, etc., but you can find some of these features in many languages nowadays. If we look at the differences, two important axes of differentiation come to mind.

One big difference between F# and OCaml on one side and Elm, Haskell, and Purescript on the other is that the former allow mutation and implicit side effects while the latter do not. I like the more rigid framework of pure languages and the fact that in Haskell and Purescript the type system allows you to distinguish between side-effectful and pure code. But there are also situations in which mutation and more imperative constructs can be more intuitive and easier to tune for performance than e.g. writing intricate folds.

Another important dimension for comparison is the complexity of their type systems. Here F# and Elm have simpler type systems, while Haskell and Purescript have features like type classes and higher-kinded types, which allow them to create higher-level abstractions in their type system, like the Functor typeclass that defines a generic map. In Elm you have to redefine map for every type you want it for, and you can't express "the set of types with a map implementation". In F# you might be able to do some weird things with generics (what parametric polymorphism is called in the .NET world) and type constraints, but nobody bothers.
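To make the point about redefining map concrete, here is a small Elm sketch. The `Tree` type and the `doubled` record are hypothetical examples. `List.map` and `Maybe.map` are the standard library functions.

```elm
module Tree exposing (Tree(..), doubled, map)


-- A hypothetical binary tree used only for this illustration.
type Tree a
    = Leaf
    | Node (Tree a) a (Tree a)


-- Each type needs its own hand-written map. There is no way to declare
-- that List, Maybe and Tree all belong to one "mappable" abstraction.
map : (a -> b) -> Tree a -> Tree b
map f tree =
    case tree of
        Leaf ->
            Leaf

        Node left value right ->
            Node (map f left) (f value) (map f right)


-- Three unrelated functions that merely share a name and a shape.
doubled : { list : List Int, maybe : Maybe Int, tree : Tree Int }
doubled =
    { list = List.map ((*) 2) [ 1, 2, 3 ]
    , maybe = Maybe.map ((*) 2) (Just 4)
    , tree = map ((*) 2) (Node Leaf 5 Leaf)
    }
```

In Haskell or Purescript, a single Functor instance per type would let generic code work with any of the three containers. In Elm, a function that wants to map over "some container" has to take the specific map function as an argument.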

Haskell is an interesting and unique case in that it's the only language of the bunch that is lazy by default. This is a very interesting feature and it allows a uniform type signature for lazy and eager values. The downside is that it's sometimes hard for non-experts to reason about the time and space complexity of code.

Apart from language features, there are of course very important differences in the runtime systems or platforms these languages target – it's hard to use Haskell in the browser or Elm on the backend. These differences are probably the most important reason to pick one of these languages over another – e.g. if you want to write in an ML style language and need to interface with code from the .NET universe, then F# is the obvious answer; if you need your code to run in the browser and interface extensively with existing JavaScript libraries, then maybe Purescript is the nicest choice.

There is also a very interesting social distinction. I think Elm especially tried very hard to be beginner friendly and rejected some abstractions to stay easier to learn. This attracts a certain kind of people, which leads to a self-fueling circle. Purescript, on the other hand, very much went into the "all the cool type features from Haskell, but without the legacy stuff" kind of direction and it, in turn, has attracted people that like thinking about and exploring higher abstractions. I think both are very valid, and I think it is great that we have such choices now so everyone can find a community they feel comfortable with (which of course might change over time).

PureScript is getting closer to 1.0 as well. What's your opinion about this language/ecosystem?

I like it a lot, but unfortunately I never had the time to do anything bigger with it. While I like Elm, I think it is great to have another ML family language targeting JavaScript that goes a different way.

One very cool thing about Purescript is how far you can get the type system to help you, especially when compared with a significantly simpler language like Elm. This starts with simple things, like Purescript allowing you to define your own instances for Ord so that you can use arbitrary data types as keys in dictionaries. Or the fact that Purescript has newtypes, which are a lightweight zero-cost abstraction around simple data types that add a lot of type safety. Purescript also comes with great metaprogramming capabilities in the form of generics (not the parametric polymorphism thing that Java/C# call generics!) and the generic deriving mechanism. This allows a lot of boring, repetitive work to be automated (compare e.g. the manual, rather tedious writing of JSON encoders/decoders in Elm with Purescript's Simple.JSON module). The last version of Purescript also added more type-level programming, so you can now implement compile-time type-safe SQL string interpolation and things like that.
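For readers who have not written Elm, this is roughly what the manual JSON work looks like. It is a sketch that assumes the `elm/json` package is installed. The `User` record and its field names are hypothetical.

```elm
module User exposing (User, decoder, encode)

import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode


-- Hypothetical record used only for this illustration.
type alias User =
    { name : String
    , age : Int
    }


-- The decoder repeats every field name and type by hand.
decoder : Decoder User
decoder =
    Decode.map2 User
        (Decode.field "name" Decode.string)
        (Decode.field "age" Decode.int)


-- The encoder repeats them once more, in the other direction.
encode : User -> Encode.Value
encode user =
    Encode.object
        [ ( "name", Encode.string user.name )
        , ( "age", Encode.int user.age )
        ]
```

Every new field has to be added in three places (the record, the decoder and the encoder), and the compiler cannot check that the string keys in the decoder and the encoder agree. Generic deriving removes exactly this kind of repetition.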

What I found very interesting was the change of the Eff type in the latest version of Purescript – before 0.12, Purescript's Eff type was using the row type feature. I thought that this was very nice because, just by looking at a function, you could see which kinds of side effects it would have (e.g. a function could define an Eff type with labels CONSOLE and HTTP but nothing else). But it seems that it did not turn out that well in the end – these were just (type-level) string labels, and different libraries could accidentally (or intentionally) reuse the same labels, which could lead to problems.

So they switched to a simpler system with 0.12 that is closer to Haskell's IO monad (so you just know whether a function is pure or effectful, but no more). If I understand it correctly, you can still implement the row-based system yourself on top of the new Effect but, by default, the row types are no longer used to track effects. I think this is an interesting case of a mental tool that sounds great in theory but was apparently not validated in practice.

There is also a lot of fruitful exploration around wrapping JavaScript libraries like React and testing different approaches to find which type features hit the sweet spot between safety and ease of use (last I checked there were like 5 different approaches to wrapping React, for example), and this kind of exploration is actively encouraged in Purescript.

The flip side of all this is that it looks to me like the Purescript ecosystem is a bit more fragmented. In Elm, since there is only one way to write applications and the type system only goes so far, most libraries are immediately compatible with each other and the learning curve for any arbitrary library is very low, even for people new to Elm. In Purescript, if you come into contact with profunctor optics/lenses for the first time, it can be very rewarding but also quite time-consuming.

I hope that Purescript will continue to prosper and hope to use it more in the future.

Elm went through a lot of breaking changes in recent years. How do you see those changes now, in retrospect (the renouncement of the FRP approach in particular)?

The renouncement of FRP was an interesting case. I think it made sense since it simplified the language somewhat and it looked like the Elm Architecture was the way forward for most normal web applications anyhow. At the same time it's a bit sad because it was a very interesting approach to build Signals so deeply into the language and have a "fold from the past" as one of the core library functions. I think we haven't really found the sweet spot yet in how to deal with change over time in code and the reactive and FRP ideas warrant some more exploration, especially when combined with compile time type checking of signal graphs and/or better tooling support like we see with the marble diagrams in RxJS.
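The Signal API no longer exists in current Elm, so the following sketch does not use it. It only illustrates the idea behind a "fold from the past" (`foldp` in the old API) with a plain list standing in for the stream of past events. The `Msg` type, `update` and `replay` are hypothetical names.

```elm
module Replay exposing (Msg(..), replay, update)


-- Hypothetical events for a simple counter.
type Msg
    = Increment
    | Decrement


-- One step: combine a new event with the state accumulated so far.
update : Msg -> Int -> Int
update msg count =
    case msg of
        Increment ->
            count + 1

        Decrement ->
            count - 1


-- Folding over everything that has happened so far yields the current state.
replay : List Msg -> Int
replay history =
    List.foldl update 0 history
```

For example, `replay [ Increment, Increment, Decrement ]` evaluates to `1`. The `update` function of the Elm Architecture has the same shape as the step function here, which is one way to see why the architecture could replace Signals for most normal web applications.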

As much as I like Elm as a language and ecosystem, I do have some problems with how the language is steered and this was also part of the reason why we are moving away from Elm at Douglas Connect. My issue is less with breaking changes and more with the way improvements are handled. For example, one core selling point of Elm is that there are no runtime errors, yet there are several documented crashing bugs that have lingered for a long time and just not been fixed, often for no understandable reason.

Or the change of focus between releases – if I remember correctly, 0.17 came with the announcement that the core Elm runtime would soon try to cover "most of the HTML 5 Web API" or whatever it was called, but then big chunks just never came and the focus shifted to other areas. What these areas are is not really clearly communicated unless you follow the language dev channels. Many straightforward use cases still mandate using ports even though it would be very easy to support them (e.g. binary file uploads from the browser are something that I have wanted since 0.16, and it's still not possible to do this without resorting to JavaScript). Finally, I think the disallowing of any kind of JavaScript in packages is unfortunate. I understand that if you want to maintain a "no runtime errors" experience, you can't allow libraries to also ship JavaScript. But I think there should be an opt-in "unstable" package channel that lets you include port setups and JavaScript code so library authors can wrap interactions with existing JS libraries. But, of course, that is just my personal opinion and Elm is clearly following another path.

What about Elm's progress on the backend? Any frameworks, SSR tools, plans you're aware of?

I stopped following the dev channels some time ago, so I can't really provide much insight here. I think server-side rendering is actively being worked on but, unless a new way to do FFI in Elm comes along, I doubt that we'll see it getting much use on the backend. If you want to use an ML language there, I think Haskell, Purescript, OCaml and F# are all established choices that make a lot of sense.

You organized a number of Elm meetups and clearly have a lot of inside experience. How would you describe the Elm community? What about people coming to Elm – are most of them JS programmers or, say, Haskellers?

I think Elm has a very nice community that is very welcoming to newcomers. For a lot of newcomers, Elm is the first ML family language. There are some experienced people with extensive Haskell knowledge, but I have a feeling that quite a few of them moved on to Purescript. Maybe because once you have tasted the sweetness of typeclasses and generics you just can't go back ;)

I think that it's extremely valuable to have Elm in the family of ML languages because they are all quite alien if all you know is Java, JavaScript, or other C-like languages. The Berlin Elm meetup has been run by Andrey Kuzmin for a while now, but I am thinking about starting a new meetup in Berlin in autumn, covering a wide range of statically typed functional programming languages. In that meetup we would switch between Elm, Haskell, F#, OCaml, Purescript, Rust, maybe some Swift or Julia – you get the idea. I think it would be very useful to get more "mainstream" software developers acquainted with these languages, so that as an industry we can speed up the transition to languages that have at least algebraic datatypes :)

You have experience with scientific programming. Most people would immediately think of R and Python... Which other, maybe newer, languages do you think have potential in this area?

Yes, this is a constant source of discomfort for me – R is really an awkward language, but so many libraries have been built on top of it, especially in bioinformatics, that you can't just start rewriting everything in your language of choice. What we often end up doing is having small web services written in R or Python that do one specific thing, and then writing the coordination and UI code in other languages (Elm and F# mostly). This way you can have the support of a good type system where you need it (in the complexity of coordinating state and complex UIs) but still use the vast scientific libraries of R and Python.

As for different languages, F# is in an interesting position here because it has some support for interfacing with Python and R, and it is available as a Jupyter kernel (interactive code notebooks) and as a first-class language on, for example, Azure Notebooks. So we might explore that a bit more, maybe even writing the web service wrappers in F# and then just calling out to specific R or Python functions from F#.

Julia is another interesting language that is built as a language for scientific computing in a modern, highly parallelized world but I haven't really had a chance to dive into that.

Finally for machine learning it looks like the immensely popular TensorFlow framework for deep neural networks might adopt Swift of all languages as its statically typed alternative to Python. This could be very interesting because it would let you do static code analysis to check if e.g. your tensor sizes all match up, instead of encountering these errors at runtime.

Can you share some of your ambitious plans with us? Like "Master Category Theory" or "Write my own language" :)

In the near future I mostly want to improve scientific data management as part of my work at Douglas Connect and give back to the F# community by helping with documentation and work on tooling. With my good friend Fredrik I would like to explore Haskell and Purescript so that I would hopefully get more fluent in both. As I said above, I am also thinking about a language agnostic statically typed FP meetup or something like that to spread the love for these languages more broadly :) Further out who knows – maybe I'll get involved in the P2P technologies that are emerging as alternatives to centralized social networks but combine that with an ML family language or try to spread its use in scientific computing.

What advice can you give to people starting out in programming right now? Which goals should they pursue, and which pitfalls should they avoid?

I think Elm is a very good language to start with, as it allows you to write web applications and you learn a lot of important concepts in a very nice and clear way without too many distractions. After that I think Python is a very useful language to know, and it broadens the areas you can work in immensely: from data science to devops to web backend development. Hopefully by then you will have learned the beauty of immutability and treat mutation and side effects with the respect and care they deserve :) After that it really depends on what you want to do. I think learning Haskell is very rewarding and extends your mental horizon (or Purescript if you want to do a lot of frontend or Node stuff). Erlang/Elixir have a lot of cool ideas, even though I miss the safety of a static type system.

For software engineers who have already learned one or more languages, I think the ML family of languages is a very worthwhile target of study. If I look back at my own history of code, I can only say that I wish I had learned an ML family language directly after my first OO language (or even before). The way we were taught OO really ended up making my code much more coupled and brittle than it had to be, and for a long time. When thinking about a data model, the first question I asked was "Can this be described using an IS-A relationship?". E.g. in the status example above, I would have created an interface IOperationStatus and then created 3 implementations for it because "Running is an IOperationStatus". But what do they really have in common? For the purpose of expressing the functionality in code, does it help you at all that the Running, Error and Success classes all implement the same interface?

The reality is that most software today is developed in languages like Java, JavaScript and C# that I think are needlessly error-prone (think null reference exceptions and everything being mutable by default) and come with relatively awkward abstractions, such as Factories, Singletons and Dependency Injection, that are all basically non-issues in ML family languages. So I think it is worthwhile learning some of these mainstream languages, if for no other reason than to be more compatible with existing jobs. (For me personally, the language I can or cannot use makes quite a difference in how happy I am in a job, though I think that there are more important criteria to select jobs by.)

But it is very useful to learn ML family languages and, given the chance, we should push for languages that allow us to write better code. I think we also need more people who are curious and study what is out there that allows us to improve our discipline, but who also try to distribute their knowledge in a non-condescending way. So if you are starting out today, stay curious and be sceptical of buzzwords and "best practices" and try to search for ways to make software development better – because I think we are only getting started with software and we will need to tackle more and more complexity in the future.

Thank you, Daniel! A lot of interesting and valuable information to think about. We wish you the best in all your career and private aspirations, and hope to see you here again.