I don’t like threads. They are brittle and dangerous… Even when the languages being used have good thread support (Java and Microsoft C++, for example), even when used by senior developers, and even when used for quite simple “fire off this worker thread so the UI will still be live while a long job runs” tasks. I’ve seen them go wrong in too many ways in the hands of good developers with plenty of multi-threading experience.
The Trouble With Threads
One of my rules when I was a dev team lead was “no threads”. They were simply too easy to get wrong and too hard to debug. You haven’t really lived until you’ve had a debugger attached to several threads simultaneously, single-stepping through each, trying to figure out the logic of a rare race condition or deadlock in someone else’s poorly documented code.
Furthermore, there are usually cleaner single-threaded ways to do many of the things threads are used for, particularly maintaining liveness. Most GUI toolkits these days provide some kind of “yield” functionality that allows events to be processed now and then, for example. These frameworks may well be multi-threaded underneath, but they are also well-tested and will rarely go wrong in ways that impact the end user.
Threads in most languages start with the most powerful primitives (mutexes, semaphores, condition variables) and build up libraries of safer, higher-level types on top of them. Once upon a time during the Dot Com boom a Java developer in an organization I worked for attempted to replace all function calls in his code with threads. So instead of calling a function he created a thread object whose entry point executed the code that would have been in the function, then set up code to join the thread when the “function” call was complete. The powerful Java threading primitives made it easy to do this. They did not make it easy to do it right, though, and after he “moved on to other opportunities” my team spent weeks turning his code back into something that worked and could be reasoned about.
That was a worst case, but I’ve also debugged a multi-threaded Java application that crashed randomly. The bug was written by a senior developer whose work I respect a lot. They had created a worker thread in the middle of a block of other code (Java lets you do that, and it was encouraged back in the day) and had mixed up the close braces, so a crucial line of code that should have been the last in the worker thread was in fact in the main thread. Oops. This kind of bug happens all the time in single-threaded code, but in that case it is deterministic and easy to catch. In this case, what happened depended on what the worker thread was doing when control was returned to the main thread and that line was run. Nine times out of ten it was all OK… But ten percent of the time the application crashed. It took me weeks of off-and-on testing, static analysis and instrumenting the code to figure out why.
It isn’t easy to avoid pitfalls of that nature in multi-threaded code in most languages.
Untangling The Threads With Tcl
Those experiences were from back in the days when two CPUs was a lot, though, and in the modern world of low-end machines with a lot more than two cores, threads offer advantages that are hard to ignore.
If only there was a language that made it really, really hard to do threads wrong.
And it happens there is: Tcl.
Tcl’s thread module, in contrast, has native primitives that make it really easy to do simple and powerful things with threads, which at the very least reduces the temptation that developers have to do complex and dangerous things with them. Tcl threads by default have (almost) no shared data. Communication is done by passing messages that send events to other threads’ event loops. The content of these events is Tcl scripts for the interpreter in the other thread to execute. Most importantly, if you must share data, there are inherently thread-safe data types supplied by the thread package itself. Mutexes exist if you must have them, but almost all uses of them are eliminated by the other primitives.
A big part of what makes this possible is that every Tcl thread has its own interpreter. This means that every thread is running in an isolated sandbox that can most easily communicate with other threads via message-passing.
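To make the isolation concrete, here is a minimal sketch, assuming the Thread package is installed and loadable with `package require Thread`. The variable names are made up for illustration. It shows that a variable set in the main thread is simply not there in a worker's interpreter, and that the way to hand a value over is to put it in the script you send.

```tcl
package require Thread

# A variable that exists only in the main thread's interpreter.
set greeting "hello from main"

# A thread created without a script waits in its event loop for messages.
set worker [thread::create]

# The script is evaluated in the worker's interpreter, which has never
# seen the variable "greeting".
set seen [thread::send $worker {info exists greeting}]
puts "worker sees greeting: $seen"

# To share a value, build it into the script that is sent. [list] quotes
# the value safely so it arrives as a single argument.
set length [thread::send $worker [list string length $greeting]]
puts "length computed in worker: $length"

# Drop our reference so the worker can shut down.
thread::release $worker
```

Nothing leaks between the two interpreters unless a message carries it, which is what makes the model so hard to get wrong.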
The benefits of this are hard to over-emphasise. Compare CPython, say, where threads all run within a single interpreter and the Global Interpreter Lock chains everything down on the sacrificial altar of the process: in a standard build only one thread can execute Python code at a time, so CPU-bound Python threads gain little from multi-core machines. Tcl threads, by contrast, are naturally friendly to multi-core architectures. They are not managed at the interpreter level, but are true OS-level threads each running a Tcl interpreter of their own. This is a good design choice in part because of how lightweight the Tcl interpreter is.
The way Tcl implements threads is sometimes called the “apartment model” of threads, in contrast to the “green threads” which are managed by the interpreter and never get to use multiple cores because of that. Where terminology like “apartment model” and “green threads” came from is not clear.
Even with one-interpreter-per-thread there is still some shared global state because there is still only one process: the current working directory, for example, is a process-level datum, so it should not be relied upon whenever there is more than one thread active. A simple workaround–keeping a local copy of cwd or other global data that is set prior to any threads being spawned–can be used to fix these cases.
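The workaround can be sketched roughly as follows, assuming the Thread package is available. The variable names and the file path `data/input.txt` are hypothetical and nothing is actually opened. The main thread records the working directory once, before any thread exists, and hands it to the worker as plain data.

```tcl
package require Thread

# Capture process-wide state once, before any threads are spawned.
set startDir [pwd]

set worker [thread::create]

# Give the worker its own private copy of the directory.
thread::send $worker [list set baseDir $startDir]

# Later work in the worker builds paths from its copy instead of
# calling [pwd] or relying on relative paths.
set path [thread::send $worker {file join $baseDir data input.txt}]
puts "worker will use: $path"

thread::release $worker
```

Even if some other thread later calls `cd`, the worker's `baseDir` stays what it was at start-up.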
Getting Started With Tcl Threads
A good starting point for the Tcl thread package can be found in this chapter from “Practical Programming in Tcl and Tk, 4th ed” by Brent Welch and Ken Jones: http://www.beedub.com/book/4th/Threads.pdf.
The thread package is pleasingly minimal. It supports thread creation and termination, configuring threads, and sending messages to them. It also supports joining threads (waiting for a thread to terminate) which is the only really dangerous operation–if the thread being joined fails to complete, a join operation will wait forever. This potentially hangs the whole application, so it still requires some care.
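As an illustration of joining, here is a small sketch that assumes the Thread package is installed. The summing loop is only a stand-in for real work. The thread is created with `-joinable` and given a script that runs to completion, so the join is guaranteed to return.

```tcl
package require Thread

# -joinable allows another thread to wait for this one. The script is the
# thread's entire life: when it finishes, the thread exits.
set worker [thread::create -joinable {
    set total 0
    for {set i 1} {$i <= 100} {incr i} {
        incr total $i
    }
    # No thread::wait here, so the thread terminates after the loop.
}]

# Blocks until the worker has exited and returns its exit code. If the
# script above never finished, this call would never return either.
set status [thread::join $worker]
puts "worker exited with status $status"
```

The care needed is all in the worker script: anything that can block indefinitely there turns the join into a hang.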
Sending Messages
Sending a message with thread::send is synchronous by default: the caller blocks until the target thread has evaluated the script and returned its result. Passing the -async flag makes the call return immediately instead, which is what you want when several threads should be working at the same time.
Each thread has its own event loop, which is a foundational part of Tcl, and by calling thread::send one thread can send a script as a message to another thread’s event loop for evaluation, as in this simple example that creates two threads, sends them scripts to be evaluated, and then waits on the result:
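package require Thread

# Create two worker threads; each waits in its own event loop
set t1 [thread::create]
set t2 [thread::create]

# Queue a script on each thread; each result is written to the variable "result"
thread::send -async $t1 "set a 1" result
thread::send -async $t2 "set b 2" result

# Enter the event loop once for each pending result
for {set i 0} {$i < 2} {incr i} {
    vwait result
}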
As described in the ActiveTcl docs, this results in the two threads doing their work asynchronously, and the result of each thread being put into the variable result, which we wait on twice to ensure both threads have completed their work.
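For comparison, here is a hedged sketch of both calling styles side by side, assuming the Thread package is installed. The variable names `answer`, `r1` and `r2` are chosen for illustration. Unlike the example above, each asynchronous result gets its own variable, so it is always clear which thread produced which value.

```tcl
package require Thread

set t1 [thread::create]
set t2 [thread::create]

# Without -async the call blocks until t1 has evaluated the script,
# and the script's result is the return value.
set answer [thread::send $t1 {expr {6 * 7}}]

# With -async each call returns at once; the result is delivered later
# through the event loop into the named variable.
thread::send -async $t1 {expr {1 + 1}} r1
thread::send -async $t2 {expr {2 + 2}} r2

# Wait for each result, skipping the wait if it has already arrived
# while we were waiting for the other one.
if {![info exists r1]} { vwait r1 }
if {![info exists r2]} { vwait r2 }

puts "sync: $answer, async: $r1 and $r2"

thread::release $t1
thread::release $t2
```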
Shared state is supported via thread-shared variables in the tsv package. This is as if C had two types of every data type, thread-safe and unsafe, so we could declare:
safe_int mysafeint;
and just use it as a variable shared between threads rather than having to declare an ordinary int and protect it by hand with a mutex any time any thread accessed it.
This is a big deal: non-atomic access to basic data types like int is a significant source of bugs in multi-threaded code, because control can be swapped between threads almost anywhere. Junior developers shouldn’t be allowed to write multi-threaded code in most languages because it is way too easy for them to miss this kind of detail, much less catch larger structural and timing issues. However, in Tcl it would be relatively safe for them to do so.
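Here is a rough sketch of what that looks like in practice, assuming the Thread package (which provides the tsv commands) is installed. The shared array name `counters`, the element name `hits`, and the thread and iteration counts are arbitrary choices for illustration. Four threads increment one shared counter with no explicit locking anywhere in the script.

```tcl
package require Thread

# A thread-shared variable: array "counters", element "hits".
tsv::set counters hits 0

# Start four joinable workers that each bump the counter 1000 times.
set workers {}
for {set i 0} {$i < 4} {incr i} {
    lappend workers [thread::create -joinable {
        for {set n 0} {$n < 1000} {incr n} {
            # tsv::incr performs the read-modify-write atomically.
            tsv::incr counters hits
        }
    }]
}

# Wait for every worker to finish before reading the total.
foreach w $workers {
    thread::join $w
}

set total [tsv::get counters hits]
puts "total hits: $total"
```

The equivalent with a plain int in C would need a mutex around every increment to avoid losing updates.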
Thread Pools
There is also a Tcl module for creating and managing thread pools. A thread pool is a group of threads that are created together. A typical application is a server, where a bunch of worker threads are created at start-up and wait to serve requests as they come in.
I was going to run through a simple example here, but in searching around I found there is already a nice example in a previous ActiveState blog. Setting up a multi-threaded server in Tcl: /blog/concurreny-tcl-weaving-threads.
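For a quick feel of the API before following that link, here is a minimal sketch using the tpool commands that ship with the Thread package. The worker counts and the squaring job are placeholders, not anything taken from the linked post.

```tcl
package require Thread

# Create a small pool; the worker limits here are arbitrary.
set pool [tpool::create -minworkers 2 -maxworkers 4]

# Post one job per number. Each job is a script that an idle worker
# evaluates; tpool::post returns a job id for collecting the result.
set jobs {}
foreach n {1 2 3 4 5} {
    lappend jobs [tpool::post $pool [list expr $n * $n]]
}

# Collect the results in posting order.
set squares {}
foreach job $jobs {
    tpool::wait $pool $job
    lappend squares [tpool::get $pool $job]
}
puts "squares: $squares"

tpool::release $pool
```

The calling code never touches an individual thread; it only posts scripts and collects results.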
Threading For Fallible Developers
People make mistakes. I know I do. It is a mark of a mature mind to recognize this and take reasonable steps to avoid situations where mistakes are likely. And this is the important thing about the Tcl thread package: it is the first time in twenty years that I’ve seen threads done in such a way that I would allow a fallible developer to use them on a routine project without being worried that I’d be chasing race conditions and deadlocks throughout the life of the codebase. Whereas for threads in C or C++ or Java I would want an extremely senior developer working on the multi-threaded parts of the code, especially at the design level, with Tcl threads I could say to an intermediate developer, “Do this with threads, don’t use mutexes or thread-shared variables” and I would be reasonably confident that fairly maintainable code would result.
In this world of increasingly scarce development resources and increasingly multi-core environments, this makes Tcl a language worth considering for a much wider range of projects than one might have otherwise thought.
Title image courtesy of charclam on Flickr under Creative Commons License.