“The workers are on fire again.”—Us, every day, before cmdstalk.
beanstalkd, PHP workers, fires
99designs pushes four million background jobs through beanstalkd each day. beanstalkd is a fantastic job queue which we’ve used for more than five years, via the pheanstalk client which I wrote in 2008.
Each beanstalkd job has a TTR (time to run): a timer which counts down during job processing. If TTR seconds elapse before the worker finishes the job, beanstalkd assumes the worker is dead and releases the job. Another one of our workers takes the job, despite the original worker still churning away. Each iteration of this results in greater load and less chance of this or any other job finishing. Eventually all the worker processes are stuck, and everything literally catches fire.
Increasing the TTR would mitigate the issue, but these EPS files seem subject to the halting problem. That leaves workers vulnerable to slow job saturation.
Interrupting the image operation when the job hits its TTR would be a better solution. But workers need concurrency to watch the TTR during the job. PHP doesn’t do threads, except via an extension that I’m disinclined to use. Using fork() would introduce IPC / signal handling complexity, and prevent processes sharing the beanstalkd connection. PHP feels like the wrong language to attack the problem.
Lachlan and I decided we could kill N birds with a single stone. One: solve the queue fires. Two: move another piece of our production infrastructure to Go. Three: provide a beanstalkd layer which our PHP, Ruby and Go apps could all use.
cmdstalk set out to harness the beanstalkd semantics we like on one end, and talk standard unix processes on the other. This allows us to write workers in any language. Here’s the basic model:
- Connect to a beanstalkd server, watch one or more tubes.
- Pipe each job payload to a command specified by -cmd.
- If the subprocess exits 0, delete the job; done.
- If the subprocess exits non-zero, release the job for retry (with backoff).
- If TTR elapses, kill the subprocess and bury the bad job.
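The steps above can be sketched in a few lines of Go. This is a minimal illustration and not cmdstalk's actual broker code: runJob and the Action type are hypothetical names, there is no beanstalkd connection, and the retry backoff is left out. It assumes sh and sleep are available on the host.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// Action is what a broker would do with the job once the command is done.
// (Hypothetical type, for illustration only.)
type Action string

const (
	Delete  Action = "delete"  // subprocess exited 0
	Release Action = "release" // subprocess exited non-zero; retry later
	Bury    Action = "bury"    // TTR elapsed; subprocess was killed
)

// runJob pipes payload to the command's stdin and waits at most ttr.
func runJob(ttr time.Duration, payload []byte, name string, args ...string) (Action, error) {
	ctx, cancel := context.WithTimeout(context.Background(), ttr)
	defer cancel()

	// CommandContext kills the process when ctx expires.
	cmd := exec.CommandContext(ctx, name, args...)
	cmd.Stdin = bytes.NewReader(payload)

	err := cmd.Run()
	var exitErr *exec.ExitError
	switch {
	case err == nil:
		return Delete, nil
	case errors.Is(ctx.Err(), context.DeadlineExceeded):
		return Bury, nil
	case errors.As(err, &exitErr):
		return Release, nil
	default:
		// The command could not be started at all; report why.
		return Release, err
	}
}

func main() {
	payload := []byte("job payload\n")
	commands := [][]string{
		{"sh", "-c", "cat >/dev/null"}, // reads stdin, exits 0
		{"sh", "-c", "exit 3"},         // fails
		{"sleep", "5"},                 // outlives the TTR
	}
	for _, c := range commands {
		action, err := runJob(500*time.Millisecond, payload, c[0], c[1:]...)
		fmt.Println(c, "=>", action, err)
	}
}
```

The point of the design is that the timeout lives in the broker, so a stuck subprocess is killed from outside instead of relying on the worker to police itself.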
Anything that can read stdin and exit(int) can be a cmdstalk worker; no need for beanstalkd knowledge.
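To make that concrete, here is a sketch of about the smallest possible worker. The handle function and its "empty payload is an error" rule are made up for illustration; a real worker would do its own processing there. Only the contract matters: read the payload from stdin, exit 0 on success, exit non-zero on failure.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"strings"
)

// handle is a hypothetical job handler. Here an empty payload counts as a failure.
func handle(payload string) error {
	if strings.TrimSpace(payload) == "" {
		return errors.New("empty payload")
	}
	fmt.Printf("processed %d bytes\n", len(payload))
	return nil
}

// exitCode maps the handler result onto the process exit status:
// 0 for success, 1 for failure.
func exitCode(err error) int {
	if err != nil {
		return 1
	}
	return 0
}

func main() {
	// The whole job payload arrives on stdin.
	payload, err := io.ReadAll(os.Stdin)
	if err == nil {
		err = handle(string(payload))
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, "job failed:", err)
	}
	os.Exit(exitCode(err))
}
```

The same shape works in PHP, Ruby, or a shell script, since nothing here knows about beanstalkd.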
$ cmdstalk -help
Usage of cmdstalk:
  -address="127.0.0.1:11300": beanstalkd TCP address.
  -all=false: Listen to all tubes, instead of -tubes=...
  -cmd="": Command to run in worker.
  -per-tube=1: Number of workers per tube.
  -tubes=[default]: Comma separated list of tubes.
Our app runs cmdstalk under supervisord like this:
cmdstalk -all -per-tube=6 -cmd="/path/to/swiftly/console worker:stdin"
Go has become the go-to language at 99designs for infrastructure components. My only previous Go experience comes from writing go6502, an 8-bit computer emulator. Fascinating, but different to writing concurrent network applications. Despite that, building cmdstalk with Go was a pleasure.
Starting from the cmdstalk entrypoint you’ll see the broker and cli packages loaded. cli/options.go demonstrates Go’s flag library for argument parsing. broker_dispatcher.go coordinates broker concurrency across tubes, and broker.go is where the action happens.
Broker.Run() is a clear candidate for refactoring, but when workers are burning, software’s better shipped than perfect.
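For readers who haven't used Go's flag package, the sketch below shows roughly how flags like the ones in the help output can be declared. It is not the contents of cli/options.go: the Options struct, the tubeList type, and parseOptions are assumptions made for illustration. Only the flag names, defaults, and descriptions come from the help text.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// tubeList implements flag.Value for a comma-separated list of tube names.
type tubeList []string

func (t *tubeList) String() string {
	if t == nil {
		return ""
	}
	return strings.Join(*t, ",")
}

func (t *tubeList) Set(s string) error {
	*t = strings.Split(s, ",")
	return nil
}

// Options is a hypothetical struct holding the parsed flags.
type Options struct {
	Address string
	All     bool
	Cmd     string
	PerTube int
	Tubes   tubeList
}

// parseOptions parses args (without the program name) into Options.
func parseOptions(args []string) (Options, error) {
	o := Options{Tubes: tubeList{"default"}}
	fs := flag.NewFlagSet("cmdstalk", flag.ContinueOnError)
	fs.StringVar(&o.Address, "address", "127.0.0.1:11300", "beanstalkd TCP address.")
	fs.BoolVar(&o.All, "all", false, "Listen to all tubes, instead of -tubes=...")
	fs.StringVar(&o.Cmd, "cmd", "", "Command to run in worker.")
	fs.IntVar(&o.PerTube, "per-tube", 1, "Number of workers per tube.")
	fs.Var(&o.Tubes, "tubes", "Comma separated list of tubes.")
	err := fs.Parse(args)
	return o, err
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", opts)
}
```

Implementing flag.Value is what lets a single -tubes flag carry a list; the built-in types cover the rest.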
ade6f6b0 introduces a simple -all flag to watch all tubes at start-up. 431ac5fc evolves it to poll for new tubes as they’re created. The latter illustrates how well timers and concurrency come together in Go. Together they show that it’s simple to add functionality that would be complex in other languages.
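As a rough idea of what polling for new tubes can look like, here is a sketch built on time.Ticker, a goroutine, and channels. It is not the code from 431ac5fc: watchNewTubes is a hypothetical name, and the list function stands in for asking beanstalkd which tubes exist.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// watchNewTubes calls list immediately and then once per interval, sending
// each tube name the first time it is seen. It stops, and closes the
// returned channel, when done is closed.
func watchNewTubes(list func() []string, interval time.Duration, done <-chan struct{}) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		seen := map[string]bool{}
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			for _, name := range list() {
				if seen[name] {
					continue
				}
				seen[name] = true
				select {
				case out <- name:
				case <-done:
					return
				}
			}
			// Wait for the next tick, or stop if asked to.
			select {
			case <-ticker.C:
			case <-done:
				return
			}
		}
	}()
	return out
}

func main() {
	// A fake tube registry, guarded by a mutex because it is read from
	// the polling goroutine and written from main.
	var mu sync.Mutex
	tubes := []string{"default"}
	list := func() []string {
		mu.Lock()
		defer mu.Unlock()
		return append([]string(nil), tubes...)
	}

	done := make(chan struct{})
	found := watchNewTubes(list, 10*time.Millisecond, done)

	fmt.Println("start workers for", <-found)
	mu.Lock()
	tubes = append(tubes, "images") // a new tube appears later
	mu.Unlock()
	fmt.Println("start workers for", <-found)
	close(done)
}
```

The caller just ranges over a channel of tube names; the timer and the bookkeeping stay inside one goroutine.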
cmdstalk applies a unix-process abstraction layer to beanstalkd job processing. Like any abstraction it needs to make itself worthwhile.
But if you need to process background jobs using several languages, some of them poorly suited to long-running daemons and concurrency, cmdstalk may be for you. Give it a try; feedback and pull requests are welcome.