In a recent TCP experiment that I carried out in user space, I faced the problem of generating relatively precise timer events asynchronously. Furthermore, multiple timers had to be maintained simultaneously.

The timerfd mechanism in Linux (see man 2 timerfd_create) allows one to create different types of timers that communicate expiry events through a file descriptor. The standard read(), select() and poll() functions can then be used to detect and process the expiration notifications. As for the clock granularity, with CLOCK_REALTIME and with High Resolution Timer (HRT) support, I get a very respectable 1.000000e-09 seconds (1 nanosecond) on my system. To check this on your own system, you can compile and run the following, which queries CLOCK_MONOTONIC (swap in CLOCK_REALTIME to check that clock; the -lrt flag is only needed if your glibc is older than 2.17):

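#include <stdio.h>
#include <time.h>
#include <unistd.h>
 
int main(void)
{
    /* _SC_CLK_TCK is the number of clock ticks per second used by times() */
    printf("Clock ticks per second %ld\n", sysconf(_SC_CLK_TCK));
 
    /* clock_getres() reports the resolution of the requested clock */
    struct timespec gran;
    if (clock_getres(CLOCK_MONOTONIC, &gran) < 0) {
        perror("clock_getres()");
        return 1;
    }
    printf("CLOCK_MONOTONIC granularity %ld s %ld ns\n", (long)gran.tv_sec, gran.tv_nsec);
 
    return 0;
}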

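Both periodic and one-shot timers can be created from a file descriptor obtained through timerfd_create(). This behavior is configurable via the itimerspec structure passed as an argument to the timerfd_settime() function: it_value sets the first expiration and it_interval the period of the following ones (a zero it_interval gives a one-shot timer).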

    struct itimerspec timerSpec;
    memset(&timerSpec, 0, sizeof(timerSpec));
 
    timerSpec.it_value.tv_sec = sec;
    timerSpec.it_value.tv_nsec = nsec;
 
    timerSpec.it_interval.tv_sec = intsec;
    timerSpec.it_interval.tv_nsec = intnsec;

Whenever the it_value field has both its tv_sec and tv_nsec members set to 0, the timer is disarmed and no further expiry events take place.

Once a timer has been armed, read() returns, as an 8-byte unsigned integer (uint64_t), the number of expirations that have occurred since the timer was last set or last read.

Epoll

Having the ability to generate timer events, wouldn't it be nice if we could simply register a callback function that gets called automatically to handle whatever processing has to be done?

Linux provides the epoll facility, which appears much more elegant and easier to use than its traditional select() counterpart. Even better maybe, it is said to scale as O(1) versus O(n) in the number of watched descriptors: this claim is reported here, then taken to Wikipedia and discussed on Stack Overflow.

One cool thing for sure is its ability to deliver event notifications on the set of file descriptors that it watches in either level-triggered or edge-triggered mode. In level-triggered mode (the default), epoll_wait() (normally a blocking call that you would place in a main loop) keeps returning as long as there is data remaining on one of its file descriptors. In edge-triggered mode (the EPOLLET flag), a descriptor is reported only when its state changes, for instance when new data arrives. Separately, when one sets the EPOLLONESHOT flag, epoll_wait() will generate only one event, after which the associated file descriptor is disabled until it is re-armed with epoll_ctl() and EPOLL_CTL_MOD.

Cooking up the final solution

Before going any further: yes, libevent exists and also uses epoll. The point here is not to use it.

Back to the initial goal of generating timer expiry notifications asynchronously via callbacks: we only have to register the file descriptor of each timerfd instance with epoll, using epoll_ctl() and the EPOLL_CTL_ADD operation. Then run epoll_wait() in a loop and you almost have a complete solution.

Two problems then remain: how do you keep track of which callback function (together with the arguments to pass) goes with a given timer, and how do you avoid blocking your application on epoll_wait()? The latter can be easily solved by running our main loop in a thread (with the necessary thread-safety precautions): DONE. For the former, epoll_ctl() delights us with the epoll_data_t union member of the epoll_event structure.

    typedef union epoll_data {
        void        *ptr;
        int          fd;
        uint32_t     u32;
        uint64_t     u64;
    } epoll_data_t;
 
    struct epoll_event {
        uint32_t     events;      /* Epoll events */
        epoll_data_t data;        /* User data variable */
    };

"void *ptr", you said? Yep, that's it: here's where our function pointer goes. Since epoll_data is a union, only one of its members can be used at a time, and we will probably want to record more than one element under ptr. To do so, we can wrap all the information we need in a structure such as:

typedef struct {
    int tfd;
    void (*timed_action_handler)(void*);
    void* arg;
} timed_action_t;

Now when epoll_wait() unblocks, we can retrieve the timed_action_t through the data.ptr member of the returned epoll_event and execute the callback, passing it the arg member of the structure.

GitHub

You can check out a sketch of the approach explained above at https://github.com/pierrelux/timedaction