“What is an event loop?” is a common question. But have you ever wondered why we need one?
If you’re a software developer at The Magic Of Coding, your task for this sprint might look like this:
Find the best way to process concurrent I/O operations in a single thread
Let’s explore why an event loop is crucial for this task by examining a concrete example. Suppose we have two files, file A and file B.
For each file we need to:
1. Read data from the file using a system call
2. Call a function processData(data) with the data read
What are our options?
What is an I/O operation?
An I/O operation transfers data to or from a program. For instance,
1. Reading a file transfers data from RAM or the hard disk to the program.
2. A server sending back an HTTP response with a JSON payload transfers data from the server to the client.
3. Reading data from a socket transfers data from the socket to the program.
I/O is the slowest fundamental operation that a computer performs. Hence, it adds a time delay between the moment the request is sent and the moment the operation completes.
The solutions below differ in what the thread does during this delay.
Option 1 – Use blocking I/O
When a program makes a blocking function call, the call does not return until the operation is completed. For instance, consider the blocking call file.read() below:
data = file.read() // blocks the thread until the data is available
print(data)
In this case, print(data) won’t execute until the data is read, which might take several seconds. If it takes 5s for the file to be read, line 2 executes after 5s.
Using blocking function calls, our solution to reading and processing data from two files would look like,
dataFromFileA = fileA.read() // executes at time T0 seconds
processData(dataFromFileA)
dataFromFileB = fileB.read() // executes at time T5s
processData(dataFromFileB)
Here, the read on fileB has to wait 5 seconds until fileA returns data. It is easy to see that a single thread cannot handle simultaneous I/O operations using blocking calls. Subsequent calls need to wait for the preceding calls to finish.
What we need is a solution where fileB.read() also executes at the T0th second.
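The cost of blocking calls can be measured with a small Python sketch. It is only an illustration under stated assumptions: a pipe fed by a timer stands in for a slow file, `DELAY` is shortened to 0.2s instead of 5s, and `blocking_read`, `process_data` and `run_blocking` are hypothetical helper names.

```python
import os
import threading
import time

DELAY = 0.2  # assumed stand-in for the 5 s delay used in the text


def blocking_read(name: str) -> bytes:
    """Issue a read whose data only becomes available DELAY seconds later."""
    read_fd, write_fd = os.pipe()

    def deliver() -> None:
        os.write(write_fd, f"data from {name}".encode())
        os.close(write_fd)

    threading.Timer(DELAY, deliver).start()  # simulates the slow device
    try:
        return os.read(read_fd, 1024)  # the calling thread is stuck here
    finally:
        os.close(read_fd)


def process_data(data: bytes) -> str:
    return data.decode().upper()


def run_blocking() -> tuple[list[str], float]:
    start = time.monotonic()
    results = [
        process_data(blocking_read("A")),  # starts at T0
        process_data(blocking_read("B")),  # only starts once A has finished
    ]
    return results, time.monotonic() - start


results, elapsed = run_blocking()
print(results, f"took {elapsed:.2f}s")
```

The total time is roughly the sum of the two delays, because the second read is not even issued until the first one has returned.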
Option 2 – Use non-blocking I/O
Operating systems support another mechanism to access resources called non-blocking I/O. In non-blocking mode, function calls always return immediately, without waiting for data to be read or written.
If no results are available, the function will simply return a predefined constant, indicating that there is no data available to return at that moment.
The fcntl() system call is used to manipulate an existing file descriptor for a resource to change its operating mode to non-blocking (with the O_NONBLOCK flag). Once the resource is in non-blocking mode, any read operation will fail with the error code EAGAIN if the resource doesn’t have any data ready to be read.
The most basic pattern for accessing this kind of non-blocking I/O is to actively poll the resource until some data is returned; this is called busy-waiting.
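As a rough illustration of O_NONBLOCK and EAGAIN, here is a Python sketch for Unix-like systems. It uses a pipe as the resource (an assumption made for the demo, since a pipe is easy to keep empty), and `try_read` is a hypothetical helper. Python surfaces EAGAIN as a `BlockingIOError` exception.

```python
import errno
import fcntl
import os

read_fd, write_fd = os.pipe()

# Read the current flags, then set them again with O_NONBLOCK added.
flags = fcntl.fcntl(read_fd, fcntl.F_GETFL)
fcntl.fcntl(read_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def try_read(fd: int):
    """Return the bytes read, or None if the read would have blocked."""
    try:
        return os.read(fd, 1024)
    except BlockingIOError as exc:
        # The underlying read() failed with EAGAIN (EWOULDBLOCK on some systems).
        assert exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
        return None


first = try_read(read_fd)      # nothing has been written yet
os.write(write_fd, b"hello")
second = try_read(read_fd)     # data is now waiting in the pipe

print(first, second)

os.close(read_fd)
os.close(write_fd)
```

The first call returns immediately with no data instead of blocking the thread; the second returns the bytes that were written in between.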
Our solution using non-blocking I/O would look like,
resources = [fileA, fileB];
while (!resources.isEmpty()) {
  for (i = 0; i < resources.length; i++) {
    data = resources[i].read(); // returns immediately, with or without data
    if (data === NO_DATA_AVAILABLE)
      continue; // there is no data to read at the moment
    if (data === RESOURCE_CLOSED)
      resources.remove(i); // remove from list if resource was closed
    else
      processData(data);
  }
}
With this, it is possible to handle multiple resources requiring I/O operations in a single thread, but it is highly inefficient.
The loops will consume precious CPU cycles for iterating over resources that are unavailable most of the time. Polling algorithms usually result in a huge amount of wasted CPU time.
If it takes 5s for data to be available for reading on the files, then the loops will iterate constantly for 5s without doing anything useful.
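To get a feel for how much work is wasted, the polling loop can be sketched in Python and made to count its empty polls. The assumptions: timer-fed pipes stand in for the two files, the delay is shortened to 0.2s, and `open_slow_source`, `try_read` and `busy_wait` are hypothetical names.

```python
import os
import threading

NO_DATA_AVAILABLE = object()  # sentinel for "the read would have blocked"
RESOURCE_CLOSED = b""         # os.read returns b"" once the writer has closed


def open_slow_source(payload: bytes, delay: float) -> int:
    """Return a non-blocking read end whose data shows up after `delay` seconds."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)

    def deliver() -> None:
        os.write(write_fd, payload)
        os.close(write_fd)

    threading.Timer(delay, deliver).start()
    return read_fd


def try_read(fd: int):
    try:
        return os.read(fd, 1024)
    except BlockingIOError:
        return NO_DATA_AVAILABLE


def busy_wait(resources: list[int]) -> tuple[list[bytes], int]:
    received, wasted_polls = [], 0
    while resources:
        for fd in list(resources):  # iterate over a copy so removal is safe
            data = try_read(fd)
            if data is NO_DATA_AVAILABLE:
                wasted_polls += 1   # a full round trip that achieved nothing
            elif data == RESOURCE_CLOSED:
                resources.remove(fd)
                os.close(fd)
            else:
                received.append(data)
    return received, wasted_polls


received, wasted_polls = busy_wait([
    open_slow_source(b"data from A", 0.2),
    open_slow_source(b"data from B", 0.2),
])
print(sorted(received), "wasted polls:", wasted_polls)
```

Both payloads arrive concurrently, but the counter typically reaches a very large number for just 0.2s of waiting, which is the wasted CPU time described above.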
Option 3 – Use event demultiplexing
Operating systems provide another native mechanism to handle concurrent, non-blocking resources efficiently. This mechanism is called a synchronous event demultiplexer, or event notification interface.
This component collects and queues I/O events that come from a set of watched resources, and blocks until new events are available to process.
Applications register resources and events they are interested in with the event notifier. When the event occurs, the event notifier notifies the application.
A solution using an event demultiplexer would look like,
watchedList.add(fileA, FOR_READ); // [1]
watchedList.add(fileB, FOR_READ);
while (events = demultiplexer.watch(watchedList)) { // [2]
  // event loop
  foreach (event in events) { // [3]
    data = event.resource.read() // This read will never block and will always return data
    if (data === RESOURCE_CLOSED)
      demultiplexer.unwatch(event.resource) // remove from watched list if resource was closed
    else
      processData(data)
  }
}
- [1] Each resource, and the associated event that we’re interested in, is added to a data structure.
- [2] The event notifier is told to watch these resources for the registered events. This call is synchronous and blocking: it only returns when events occur, so the body of the loop only executes after this call returns.
- [3] Each event is processed. At this point, the resource associated with each event is guaranteed to be ready to read and will not block during the operation. When all events are processed, the code blocks again at [2] until new events are available to process. This is called the event loop.
This solves the issues with previous options,
- At T0s, events are registered for both files. At some point in the future, read events are returned for both files and a non-blocking read is performed which is guaranteed to return data.
- Since the call to the notifier is blocking, the loop only executes when events are available. So CPU cycles are not wasted in trying to read resources that are not yet ready.
We can see that with this approach, a single thread can handle multiple I/O operations concurrently.
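The same loop can be sketched in runnable form with Python’s standard selectors module, which wraps the operating system’s event notification interface. This is a minimal sketch for Unix-like systems, not a definitive implementation: timer-fed pipes stand in for the two files, and `open_slow_source` and `event_loop` are hypothetical names.

```python
import os
import selectors
import threading


def open_slow_source(payload: bytes, delay: float) -> int:
    """Return the read end of a pipe whose data shows up after `delay` seconds."""
    read_fd, write_fd = os.pipe()

    def deliver() -> None:
        os.write(write_fd, payload)
        os.close(write_fd)

    threading.Timer(delay, deliver).start()
    return read_fd


def event_loop(sources: dict[str, int]) -> list[tuple[str, bytes]]:
    selector = selectors.DefaultSelector()
    for name, fd in sources.items():
        selector.register(fd, selectors.EVENT_READ, data=name)  # [1]

    received = []
    while selector.get_map():                  # stop when nothing is watched
        for key, _mask in selector.select():   # [2] blocks until events arrive
            data = os.read(key.fd, 1024)       # [3] the resource is ready
            if data == b"":                    # writer closed: stop watching
                selector.unregister(key.fd)
                os.close(key.fd)
            else:
                received.append((key.data, data))
    selector.close()
    return received


received = event_loop({
    "A": open_slow_source(b"data from A", 0.2),
    "B": open_slow_source(b"data from B", 0.2),
})
print(sorted(received))
```

While neither pipe has data, the thread sleeps inside `select()` instead of spinning, so both reads complete after about one delay without burning CPU in between.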
The non-blocking I/O engine of Node.js – libuv
Each operating system has its own interface for the event demultiplexer: epoll on Linux, kqueue on Mac OS X, and IOCP on Windows. libuv provides a layer of abstraction over these inconsistent interfaces and offers a uniform, platform-independent interface to Node.js.
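libuv’s own API is written in C and is not shown here. As a loose analogy only, Python’s standard selectors module hides a similar set of platform differences behind one class: `DefaultSelector` resolves to the most efficient implementation available on the current platform.

```python
import selectors

# DefaultSelector is an alias for the best implementation on this platform,
# for example EpollSelector on Linux or KqueueSelector on macOS.
selector = selectors.DefaultSelector()
backend = type(selector).__name__
print("event notification backend:", backend)
selector.close()
```

Application code written against the common interface stays the same regardless of which backend is chosen, which is the same benefit libuv gives Node.js.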
The spread of computers and the Internet will put jobs in two categories. People who tell computers what to do, and people who are told by computers what to do.
Marc Andreessen