{"id":1067,"date":"2013-06-21T01:00:00","date_gmt":"2013-06-20T23:00:00","guid":{"rendered":"https:\/\/www.fussylogic.co.uk\/blog\/?p=1067"},"modified":"2013-08-28T11:09:13","modified_gmt":"2013-08-28T10:09:13","slug":"unixs-achilles-heel-i","status":"publish","type":"post","link":"https:\/\/www.fussylogic.co.uk\/blog\/?p=1067","title":{"rendered":"UNIX&#8217;s Achilles&#8217; Heel I"},"content":{"rendered":"<p>UNIX has one (to my mind) big hole in its API. Windows, surprisingly, doesn\u00e2\u20ac\u2122t share it. This article will discuss that gap, and how it can be worked around.<\/p>\n<h1 id=\"file-descriptor-wakes\">File Descriptor Wakes<\/h1>\n<p>There are a few multiplexing system calls, <code>select()<\/code>, <code>poll()<\/code>, and <code>epoll()<\/code> that put a thread to sleep, with the wakeup triggered by activity on particular file descriptors. The modern choice is <code>epoll()<\/code>, but classic examples always use <code>select()<\/code>. They\u00e2\u20ac\u2122re the same at their core.<\/p>\n<p>As well as monitoring an open file or socket, there are handy ways of making a descriptor that can be waited on in one thread, and triggered in another. For example, <code>eventfd()<\/code> and <code>pipe()<\/code>.<\/p>\n<p>UNIX signals can be coaxed into fitting this model with <code>signalfd()<\/code>, which removes some of the async signal handling problems that can plague systems software.<\/p>\n<p>Timers can be created to notify via a file descriptor using <code>timerfd_xxx()<\/code> family functions.<\/p>\n<p>We can monitor for filesystem changes with <code>inotify()<\/code>.<\/p>\n<p>All these combine to mean we can, by use of the multiplexing API, listen simultaneously for any mixture of these events, and handle each appropriately as it happens.<\/p>\n<h1 id=\"thread-wakes\">Thread Wakes<\/h1>\n<p>Many of the pthread library functions can put a thread to sleep, the core mutex, for example, will sleep until any other thread holding that lock releases it. In theory, that could be used as a wake mechanism. Better though is to use the system intended for event notification across threads \u00e2\u20ac\u201c the wait condition.<\/p>\n<p>A wait condition is really just a handle that one or more threads reference, and sleep on, and any other thread can trigger a wake.<\/p>\n<p>There is no native API for multiplexing wait conditions. You cannot wait for <em>conditionA<\/em> or <em>conditionB<\/em> in one call.<\/p>\n<h1 id=\"never-the-twain\">Never The Twain<\/h1>\n<p>The hole in UNIX is exemplified by a requirement like this:<\/p>\n<p><em>Run a thread that wakes up when a network connection is received, or when, say, a new command is pushed onto a command queue<\/em>.<\/p>\n<p>The problem is that there is no way to bridge wait conditions into the multiplexed APIs. You can\u00e2\u20ac\u2122t sleep on a file descriptor <em>or<\/em> wait condition; you can only sleep on one. Worse, as mentioned above, you cannot even wait on multiple wait conditions at once.<\/p>\n<h1 id=\"solution-1\">Solution #1<\/h1>\n<p>The <a href=\"http:\/\/stackoverflow.com\/questions\/8593004\/waiting-on-a-condition-pthread-cond-wait-and-a-socket-change-select-simultan\">common solution<\/a> is to abandon wait conditions and use <code>eventfd()<\/code> or <code>pipe()<\/code> as the event signalling mechanism.<\/p>\n<p>This will work but has the distinct drawback of using up a file descriptor \u00e2\u20ac\u201c a limited resource. It also means you lose a lot of flexibility, because you have to centralise that file descriptor so that it\u00e2\u20ac\u2122s known and monitored by the thread that wakes <em>and<\/em> the thread that sleeps. Your design becomes incapable of abstracting the supply of events \u00e2\u20ac\u201c you have to tie the event generator and the event receiver closely together.<\/p>\n<p>This makes it hard to write a library module that supplies events. It ends up having to know who its supplying events to.<\/p>\n<h1 id=\"solution-2\">Solution #2<\/h1>\n<p>My solution is one that I based on an obscure article I found by Sun, which now that Sun has gone, I can\u00e2\u20ac\u2122t find online. It talked about implementing a Windows-style <a href=\"http:\/\/stackoverflow.com\/questions\/8593004\/waiting-on-a-condition-pthread-cond-wait-and-a-socket-change-select-simultan\"><code>WaitFor<\/code> call<\/a>. The advantage that Windows has in this case is that file descriptors are perfectly acceptable event handles for it\u00e2\u20ac\u2122s <code>WaitFor<\/code> family. However, Sun\u00e2\u20ac\u2122s example was only about implementing the ability to wait on multiple conditions in parallel, not wait conditions and file descriptors. We\u00e2\u20ac\u2122ll discuss just that simple case first, and move on to mixing in file descriptors in a later article.<\/p>\n<h2 id=\"multiplexed-wait-conditions\">Multiplexed Wait Conditions<\/h2>\n<p>The key observation we need to make is that we really can only wait for one wait condition at a time on UNIX. Therefore we are never going to be able to call a single function that takes multiple handles created by <code>pthread_cond_init()<\/code> and wakes on any of them.<\/p>\n<p>The typical form of use of one of these wait conditions would be something like this:<\/p>\n<pre class=\"sourceCode C\"><code class=\"sourceCode c\"><span class=\"dt\">void<\/span> waitingThread() {\n    <span class=\"co\">\/\/ --- waiting thread<\/span>\n    pthread_cond_init( &amp;conditionXGreaterThan10, null );\n    <span class=\"co\">\/\/ ...<\/span>\n    pthread_mutex_lock( &amp;lockX );\n    <span class=\"kw\">if<\/span>( !(x &gt; <span class=\"dv\">10<\/span>) )\n        pthread_cond_wait( &amp;conditionXGreaterThan10, &amp;lockX );\n    pthread_mutex_unlock( &amp;lockX );\n}\n\n<span class=\"dt\">void<\/span> threadThatModifiesX() {\n    <span class=\"co\">\/\/ --- thread that modifies X<\/span>\n    pthread_mutex_lock( &amp;lockX );\n    x = newValueOfX;\n    <span class=\"kw\">if<\/span>( x &gt; <span class=\"dv\">10<\/span> )\n        pthread_cond_broadcast( &amp;conditionXGreaterThan10 );\n    pthread_mutex_unlock( &amp;lockX );\n}<\/code><\/pre>\n<p>Then of course, having been led down that path, we create another wait condition for monitoring an event on, say, <code>y == 0<\/code>. That all then collapses when we want one thread to also wake on either condition.<\/p>\n<p>We have to focus instead on what we want: the ability to wake on any of multiple events. The trick is the realisation that we can represent those multiple events with a single wait condition.<\/p>\n<pre class=\"sourceCode C\"><code class=\"sourceCode c\"><span class=\"dt\">void<\/span> waitingThread() {\n    pthread_cond_init( &amp;conditionXGreaterThan10, null );\n    pthread_cond_init( &amp;conditionYEquals0, null );\n    pthread_cond_init( &amp;conditionXorYEvent, null );\n    <span class=\"co\">\/\/ ...<\/span>\n    <span class=\"co\">\/\/ conditionXGreaterThan10 code as above<\/span>\n    <span class=\"co\">\/\/ ...<\/span>\n    pthread_mutex_lock( &amp;lockXY );\n    <span class=\"kw\">if<\/span>( !(x &gt; <span class=\"dv\">10<\/span> || y == <span class=\"dv\">0<\/span>) )\n        pthread_cond_wait( &amp;conditionXorY, &amp;lockXY );\n    pthread_mutex_unlock( &amp;lockXY );\n}\n\n<span class=\"dt\">void<\/span> threadThatModifiesX() {\n    pthread_mutex_lock( &amp;lockX );\n    pthread_mutex_lock( &amp;lockXY );\n    x = newValueOfX;\n    <span class=\"kw\">if<\/span>( x &gt; <span class=\"dv\">10<\/span> ) {\n        pthread_cond_broadcast( &amp;conditionXGreaterThan10 );\n        pthread_cond_broadcast( &amp;conditionXorYEvent );\n    }\n    pthread_mutex_unlock( &amp;lockXY );\n    pthread_mutex_unlock( &amp;lockX );\n}\n\n<span class=\"dt\">void<\/span> threadThatModifiesY() {\n    <span class=\"co\">\/\/ --- thread that modifies y<\/span>\n    pthread_mutex_lock( &amp;lockY );\n    pthread_mutex_lock( &amp;lockXY );\n    y = newValueOfY;\n    <span class=\"kw\">if<\/span>( y == <span class=\"dv\">0<\/span> ) {\n        pthread_cond_broadcast( &amp;conditionYEquals0 );\n        pthread_cond_broadcast( &amp;conditionXorYEvent );\n    }\n    pthread_mutex_unlock( &amp;lockXY );\n    pthread_mutex_unlock( &amp;lockY );\n}<\/code><\/pre>\n<p>I\u00e2\u20ac\u2122ve left the individual events in too as other threads might monitor them alone, but the important thing to note is that when we want to wait on multiple events we still only wait on one condition. The difference is that it\u00e2\u20ac\u2122s a composite condition, representing all of the possible wake conditions.<\/p>\n<p>This sort of code is obviously unwieldy, we would have to alter every bit of code that modifies a variable to tell it about every bit of code that includes a monitor of that variable \u00e2\u20ac\u201c worse the thread that modifies <code>x<\/code> has to know at least a little bit about <code>y<\/code> thanks to <code>lockXY<\/code>. Very bad for code isolation.<\/p>\n<p>The solution is to abstract away the multiple broadcasts and multiple locks so that neither side needs to know anything about the other. Clearly then there are two sides: the notifier and the waiter. Here\u00e2\u20ac\u2122s what we\u00e2\u20ac\u2122ll aim for (I\u00e2\u20ac\u2122m going to flip to C++ now as we\u00e2\u20ac\u2122ll get needlessly messy code without some object oriented facilities):<\/p>\n<pre class=\"sourceCode C\"><code class=\"sourceCode c\"><span class=\"co\">\/\/ --- global<\/span>\n<span class=\"dt\">int<\/span> x = <span class=\"dv\">0<\/span>;\n<span class=\"dt\">int<\/span> y = <span class=\"dv\">100<\/span>;\nTNotifier notifyXGreaterThan10;\nTNotifier notifyYEquals0;\n\n<span class=\"dt\">void<\/span> waitingThread() {\n    TEventWaiter EW;\n    EW.waitfor( notifyXGreaterThan10 );\n    EW.waitfor( notifyYEquals0 );\n    EW.sleep();\n\n    <span class=\"co\">\/\/ ... interrogate EW for waker ...<\/span>\n}\n\n<span class=\"dt\">void<\/span> notifyingThread1() {\n    <span class=\"co\">\/\/ --- notifying thread #1<\/span>\n    notifyXGreaterThan10.lock();\n    x = newValueOfX;\n    <span class=\"kw\">if<\/span>( x &gt; <span class=\"dv\">10<\/span> )\n        notifyXGreaterThan10.broadcast();\n    notifyXGreaterThan10.unlock();\n}\n\n<span class=\"dt\">void<\/span> notifyingThread2() {\n    <span class=\"co\">\/\/ --- notifying thread #2<\/span>\n    notifyYEquals0.lock();\n    y = newValueOfX;\n    <span class=\"kw\">if<\/span>( y &gt; <span class=\"dv\">10<\/span> )\n        notifyYEquals0.broadcast();\n    notifyYEquals0.unlock();\n}<\/code><\/pre>\n<p>Notice that the <code>x<\/code>-modifying thread no longer needs to know anything about <code>y<\/code>; notice also that we\u00e2\u20ac\u2122re back to having the original simple appearance that the first single-condition example had.<\/p>\n<p>It\u00e2\u20ac\u2122s all very well to show the end point, how do we do it though? We invert the control \u00e2\u20ac\u201c instead of having a wait condition per condition, we have a wait condition per waiter, and attach as many events to it as we want.<\/p>\n<p>The following is nowhere near complete, nor even thread-safe, but it captures the core idea.<\/p>\n<pre class=\"sourceCode C\"><code class=\"sourceCode c\">class TEventWaiter {\n  public:\n    TEventWaiter() :\n        mEvent(PTHREAD_COND_INITIALIZER),\n        mConditionListLock(PTHREAD_MUTEX_INITIALIZER)\n    {\n    }\n\n    ~TEventWaiter() {\n        <span class=\"co\">\/\/ ... iterate through condition list and unsubscribe<\/span>\n    }\n\n    bool sleep() {\n        pthread_mutex_lock( &amp;mConditionListLock );\n        pthread_cond_wait( &amp;mEvent, &amp;mConditionListLock );\n        pthread_mutex_unlock( &amp;mConditionListLock );\n    }\n\n    <span class=\"dt\">void<\/span> waitfor(TNotifier *notifier) {\n        pthread_mutex_lock( &amp;mConditionListLock );\n        mConditions.push_back(notifier);\n        notifier-&gt;subscribe(this);\n        pthread_mutex_unlock( &amp;mConditionListLock );\n    }\n\n    <span class=\"dt\">void<\/span> broadcast( <span class=\"dt\">const<\/span> TNotifier * ) {\n        pthread_cond_broadcast( &amp;mEvent );\n    }\n\n  protected:\n    pthread_cond_t mEvent;\n    mutable pthread_mutex_t mConditionListLock;\n    list&lt;TNotifier*&gt; mConditions;\n};\n\nclass TNofifier {\n  public:\n    TEventWaiter() :\n        mSubscriberListLock(PTHREAD_MUTEX_INITIALIZER)\n    {\n    }\n\n    <span class=\"co\">\/\/ Tell every subscriber the event has occurred<\/span>\n    <span class=\"dt\">void<\/span> broadcast() <span class=\"dt\">const<\/span> {\n        pthread_mutex_lock( &amp;mSubscriberListLock );\n        list&lt;TEventWaiter*&gt;::const_iterator it;\n        <span class=\"kw\">for<\/span>( it = mSubscribers.begin(); it != mSubscribers.end(); it++ ) {\n            it-&gt;broadcast( this );\n        }\n        pthread_mutex_unlock( &amp;mSubscriberListLock );\n    }\n\n    <span class=\"co\">\/\/ Allow a waiter to subscribe to us<\/span>\n    <span class=\"dt\">void<\/span> subscribe(TEventWaiter* subscriber) {\n        pthread_mutex_lock( &amp;mSubscriberListLock );\n        mSubscribers.push_back(subscriber);\n        pthread_mutex_unlock( &amp;mSubscriberListLock );\n    }\n\n  protected:\n    mutable pthread_mutex_t mSubscriberListLock;\n    list&lt;TEventWaiter*&gt; mSubscribers;\n};<\/code><\/pre>\n<p>This is an extraordinarily sparse implementation as I\u00e2\u20ac\u2122m already well passed my quota for code in a blog post, I\u00e2\u20ac\u2122ll get around to a fuller, more general implementation and will post it on my github page eventually (emails to prod me when I forget are welcome).<\/p>\n<p>The important point to note is that these two objects provide, effectively, two table joins (although there is no database of course). One maintains the list of events that can wake it, and the other maintains the list of listeners that should be woken. That is what gives us our strong isolation When a waiter subscribes to an event it does not care nor know if there are fifty others waiting on the same condition. When a supplier of a notification triggers a wake it does not care if one of its waiters is waiting for fifty other events too.<\/p>\n<p>I\u00e2\u20ac\u2122m conscious now that I\u00e2\u20ac\u2122ve already (as is my habit) written too much for a single post. I\u00e2\u20ac\u2122ll return to this next time when I\u00e2\u20ac\u2122ll discuss using this technique to blend wait conditions and file descriptor multiplexing. We\u00e2\u20ac\u2122ll also ditch the <code>pthreads<\/code> dependence, and use the nice new C++11 concurrency libraries instead.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>UNIX has one (to my mind) big hole in its API. Windows, surprisingly, doesn\u00e2\u20ac\u2122t share it. This article will discuss that gap, and how it can be worked around. File Descriptor Wakes There are a few multiplexing system calls, select(), poll(), and epoll() that put a thread to sleep, with the wakeup triggered by activity\u2026 <span class=\"read-more\"><a href=\"https:\/\/www.fussylogic.co.uk\/blog\/?p=1067\">Read More &raquo;<\/a><\/span><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[42,6,81,80],"_links":{"self":[{"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1067"}],"collection":[{"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1067"}],"version-history":[{"count":6,"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1067\/revisions"}],"predecessor-version":[{"id":1205,"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1067\/revisions\/1205"}],"wp:attachment":[{"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1067"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1067"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.fussylogic.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1067"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}