Last Friday we put version 1.2.2 of Clash on GitHub, uploaded a source release to Hackage and a binary release to Snapcraft. Thanks in large part to our lovely community, we’ve found and fixed many bugs. Thanks to some patches, Clash even runs anywhere from 2% to 20% faster for some common design patterns! Of course, the full list of changes can be found in the CHANGELOG over at GitHub. With this blog post we’d like to highlight some of the progress we’ve made since the 1.2 release, and elaborate on our future plans.
With the release of Clash 1.2.2 we’ve also released the first iteration of our starter project! The goal of this starter project is to provide a fresh template for a new Clash project. Dependencies can now be added exactly like you would for a Haskell project: simply add them to the dependency list in the project’s cabal file. Previously you’d have to write custom scripts, or rely on nix to solve it for you!
Memory leak begone
For a while now we’ve seen pretty bad memory leaks during simulation in commercial deployments of Clash, but haven’t been able to replicate this in a small example. Through various workarounds we’ve been able to reduce the memory usage to make it usable on “reasonable” hardware and moved on to other, by that time higher priority, issues. We were recently confronted with this issue again, reported on our public issue tracker. This time it even had a small example! Time to investigate.
Clash designs operate on infinite streams of data called signals. Signals model (bundles of) wires, where each value corresponds to a stabilized signal at the end of a clock cycle. Signals do not always carry known values. For example, in some hardware configurations the very first value produced by a memory element is unknown. For this purpose, Clash exports an exception-throwing “unknown” value called XException.
This presents a problem: for performance reasons you might want to evaluate a value to Weak Head Normal Form (WHNF) or even Normal Form (NF). Usually, you’d use deepseq to do this, but for typical values used in Clash you can’t, as evaluating an XException to (WH)NF would stop your circuit’s simulation dead in its tracks! To work around this, Clash 1.0 introduced a type class NFDataX. It allows us to use variants of deepseq that simply ignore XExceptions, but otherwise do what you expect them to do. Where you’d normally write:
a `seq` b
you’d write the following instead:
a `seqX` b
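To make the difference concrete, here is a minimal, self-contained sketch that does not depend on clash-prelude. The XException type, unknownValue, and this seqX are simplified stand-ins written for this example (assumptions, not Clash’s actual definitions); the real versions are exported by Clash itself.

```haskell
import Control.Exception (Exception, catch, evaluate, throw)
import System.IO.Unsafe (unsafeDupablePerformIO)

-- Stand-in for Clash's XException (assumption: simplified for this sketch).
newtype XException = XException String
  deriving Show

instance Exception XException

-- A hypothetical "unknown" value: forcing it throws an XException.
unknownValue :: Int
unknownValue = throw (XException "unknown initial value")

-- Like seq, but an XException raised while forcing the first argument
-- is swallowed, and the second argument is returned regardless.
seqX :: a -> b -> b
seqX a b = unsafeDupablePerformIO
  (catch (evaluate a >> return b) (\(XException _) -> return b))
{-# NOINLINE seqX #-}
infixr 0 `seqX`

main :: IO ()
main = do
  -- Forcing the unknown value is attempted, the exception is ignored,
  -- and the simulation carries on.
  putStrLn (unknownValue `seqX` "simulation continues")
  -- By contrast, (unknownValue `seq` "...") would rethrow the XException.
```

Any other exception raised while forcing the first argument is not caught by this sketch and propagates as usual.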
A very typical pattern in memory primitives is to use these functions (indirectly) through defaultSeqX, like so:
a `defaultSeqX` a :- as
The purpose of this expression is to force a (WH)NF evaluation of a before being able to observe the spine of the signal. I.e., evaluate a value at time t if you want to observe the value at t+1. This solves the same problem as the one described under “Building up unevaluated expressions” on the Haskell wiki. However, defaultSeqX had no fixity declaration, so it defaulted to infixl 9 and bound tighter than :- (which is infixr 5). It turns out that this expression was therefore parsed as:
(a `defaultSeqX` a) :- as
instead of what we actually wanted and expected:
a `defaultSeqX` (a :- as)
We’ve since added infixr 0 `defaultSeqX`. The expression now parses as expected and we’ve observed that this fixes the memory issues our clients were seeing! It turns out that deepseq has exactly the same problem, for which we’ve submitted a patch.
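The effect of the missing fixity declaration can be reproduced without Clash. The sketch below uses a minimal Stream type as a stand-in for signals, and two hypothetical wrappers around seq (seqDefault and seqFixed, names invented for this example) that differ only in their fixity. It checks whether evaluating the stream to WHNF also forces its head.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Minimal stand-in for a signal: an infinite stream (assumption for this sketch).
data Stream a = a :- Stream a
infixr 5 :-

-- No fixity declaration: defaults to infixl 9, binding tighter than (:-).
seqDefault :: a -> b -> b
seqDefault = seq

-- The same function, with the fixity that was added in the fix.
seqFixed :: a -> b -> b
seqFixed = seq
infixr 0 `seqFixed`

-- Parses as (a `seqDefault` a) :- leaky a: the head stays an unevaluated thunk.
leaky :: Int -> Stream Int
leaky a = a `seqDefault` a :- leaky a

-- Parses as a `seqFixed` (a :- strict a): the head is forced with the spine.
strict :: Int -> Stream Int
strict a = a `seqFixed` a :- strict a

-- True when evaluating the stream to WHNF also forced its head value.
spineForcesHead :: (Int -> Stream Int) -> IO Bool
spineForcesHead mk = do
  r <- try (evaluate (mk (error "head forced")))
         :: IO (Either ErrorCall (Stream Int))
  pure (either (const True) (const False) r)

main :: IO ()
main = do
  spineForcesHead leaky  >>= print  -- False
  spineForcesHead strict >>= print  -- True
```

In the leaky variant every element keeps a reference to an unevaluated thunk, which is the kind of build-up that shows up as growing memory use in long simulations.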
Huge thanks to Dr. Gergő Érdi, who pushed us to get to the bottom of the issue and release a fix for it 🎉.
Plans for 1.4
The next major release of Clash will be 1.4. Two major additions are in the pipeline:
Groundwork for a partial evaluator. Clash currently uses a transformation-based rewrite system to translate Haskell into synthesizable circuits. This transformation system was originally written to make all terms in GHC’s/Clash’s Core IR completely monomorphic and first-order. Bit by bit, more and more transformations were added: removing case statements based on a known subject, unrolling primitive definitions, removing lambdas by propagating their arguments, and many more. As the number of transformations grew, so did the complexity and unpredictability. In fact, we’re currently seeing issues that are very hard to solve within a transformation-based system.
Luckily, many of these transformations are basically trying to do one thing: evaluate a Haskell core expression. By building a partial evaluator, we hope to greatly reduce the complexity of Clash. We believe this should, when fully implemented, improve synthesis times by at least an order of magnitude. We don’t expect it to be fully implemented before Clash 1.6, however.
Shake rules for Clash. Clash is often but a small part of the larger toolchain needed to get hardware running. Working with that toolchain can lead down a frustrating rabbit hole of Tcl scripts, if it is automated at all. By publishing a “blessed” package containing Shake integration we hope to consolidate community effort, and be able to build support for a large number of toolchains. We’ll be basing this on Dr. Gergő Érdi’s work over at github.com/gergoerdi/clash-shake.
Many thanks to our lovely community once again, and happy Clashing!