
Introduction to Pipes in UNIX systems

Unix pipes provide a mechanism for inter-process communication, allowing the output of one process to be fed as input to another process. This makes them a powerful tool for connecting different processes and creating data flows between them.

bash
# Counting the lines in the `file.txt` file
$ cat file.txt | wc -l

If you have no idea what is happening in the command above, you will by the end of this article. Read on to find out more.

Explanation of what pipes are

This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.

Pipes are what bring this philosophy to life. They are a feature of the Unix operating system that allows for interprocess communication (IPC) by creating a one-way communication channel between two processes. Pipes facilitate the transfer of data between these processes, providing a convenient way to pass the output of one process as the input of another.
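
To make this concrete, here is a small sketch of such a composition (app.log is a hypothetical file name). Each program in the chain does one small job, and the pipes connect them into a single data flow:

bash
# `grep` keeps only the lines containing "error",
# `sort` orders them, and `uniq -c` counts the distinct lines
$ cat app.log | grep "error" | sort | uniq -c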

Most basic UNIX commands (programs)

cd - change directory

ls - lists your files

mv filename1 filename2 - moves a file (or renames it)

cp filename1 filename2 - copies a file

rm filename - removes a file

wc filename - tells you how many lines, words, and bytes there are in a file (wc -l prints just the line count)
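
A short session using a few of these commands might look like this (the file names are made up for illustration):

bash
$ cp notes.txt notes_backup.txt       # copy notes.txt
$ mv notes_backup.txt old_notes.txt   # rename the copy
$ wc old_notes.txt                    # line, word, and byte counts
$ rm old_notes.txt                    # remove the file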

The implementation of pipes in the Unix kernel involves the creation of a special kind of file, called a pipe file, which acts as a bridge between the processes involved. This pipe file is represented by an inode, a data structure that contains information about the file, such as its size, permissions, and location.

When a pipe is created, two file descriptors are returned: one for the read end of the pipe and another for the write end. The processes involved can then use these file descriptors to read from and write to the pipe, respectively. The read and write ends of the pipe are connected, allowing data to be transferred between the processes.

The functionality of pipes in the Unix kernel includes buffering the data transferred between the processes, ensuring that it is temporarily stored until it can be read by the receiving process. Pipes have a fixed capacity, and when this capacity is reached, the writing process is blocked until there is sufficient space in the pipe to write more data.
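
This blocking behavior is easy to observe from the shell. In the following sketch, dd writes 128 KiB into a pipe whose reading end (sleep) never consumes anything, so dd stalls once the pipe's buffer (commonly 64 KiB on Linux) fills up:

bash
# `dd` produces 128 KiB of zeros; `sleep` never reads from the pipe,
# so `dd` blocks after filling the pipe's buffer and is only released
# when `sleep` exits and the read end of the pipe is closed
$ dd if=/dev/zero bs=1k count=128 | sleep 60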

Standard Output and Pipes

Standard output (stdout) is the default stream to which a command or program writes its output when it is executed; in a terminal session it is connected to the terminal by default. This output can be redirected to other commands or files using pipes.

bash
# the `echo` program prints whatever is provided as its arguments to standard output
$ echo 1

Pipes, denoted by the "|" symbol, are used to connect the standard output of one command to the standard input of another command. This allows for a seamless flow of data between commands, enabling complex operations to be performed. Instead of displaying the output on the terminal, it can be passed directly as input to another command, allowing for further processing or customization.

bash
# `cat` prints the contents of a file
$ cat file.txt

For example, consider the command ls, which lists the files and directories in the current directory. By appending | grep 'txt', the output of the ls command is filtered through the grep command, which searches for lines containing the specified pattern, in this case, "txt". The final result is a list of only the entries whose names contain "txt".

bash
$ ls | grep '.txt'

In this example, the standard output of the ls command, which lists all files in the directory, is redirected to the grep command using the pipe operator. The grep command then filters the output to show only the file names containing ".txt".

In Linux systems, there are two types of pipes available: unnamed pipes and named pipes. Unnamed pipes are created automatically when needed, and they exist only for the duration of the connection between two commands. Named pipes, also known as FIFOs, are special files that are created using the "mkfifo" command and can persist beyond the scope of a single connection.

Alternatively, the standard output can be redirected to a file using the redirection operator (>), followed by the file name. For example, to save the output of the ls command to a file called "filelist.txt", you could use the following command:

bash
$ ls > filelist.txt

This command redirects the standard output of the ls command to the file "filelist.txt" instead of displaying it on the screen.
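
Pipes and file redirection can also be combined on a single command line, for example to filter a listing first and store only the result (the output file name is arbitrary):

bash
# filter the listing through grep, then write the matches to a file
$ ls | grep '.txt' > text_files.txt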

Redirecting the standard output with pipes or to a file allows for further processing or storage of the output, providing flexibility and enabling automation in command-line operations.

In summary, standard output and pipes are integral features of Linux systems that allow command output to be redirected, enabling data to be customized and manipulated as it flows seamlessly between commands.

Named pipes

Named pipes serve as a method of interprocess communication, facilitating the exchange of data between two or more processes. In contrast to unnamed pipes, named pipes possess a supporting file and a distinct API, enhancing their versatility and capabilities.

Unlike unnamed pipes, a named pipe's backing file persists in the file system, even across system reboots. This allows processes launched after the creation of a named pipe to connect to it and communicate seamlessly. This is useful for scenarios like logging: a daemon process can continuously write logs to a named pipe, while a separate log viewer process, potentially started later, can read from the same named pipe for real-time logging. Be aware, however, that the pipe's contents are not persisted between reboots, and there are limits to how much data can reside in memory when using a named pipe!
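
A minimal sketch of that logging scenario, using the mkfifo command covered in the next section (my_daemon and the pipe path are hypothetical):

bash
# terminal 1: create the pipe and point the daemon's output at it
$ mkfifo /tmp/app_logs
$ ./my_daemon > /tmp/app_logs

# terminal 2: a viewer, started later, reads the logs in real time
$ cat /tmp/app_logs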

Setting up a named pipe

To establish a named pipe through the command line, the mkfifo command is utilized, followed by the chosen name for the pipe. For instance, to create a named pipe named "my_pipe", execute the command mkfifo my_pipe. This command generates a specialized file within the file system, acting as a conduit for communication between processes.

Once the named pipe is established, it becomes operational by opening it in one process for writing and in another for reading. In one terminal window, data can be written to the named pipe using a command such as echo "Hello, World!" > my_pipe (note that this command blocks until a reader opens the other end of the pipe). In another terminal window, the data can be read from the named pipe using a command like cat my_pipe.

Named pipes prove valuable when disparate processes need to communicate, particularly when they are not directly connected or running concurrently. They offer a dependable and effective means of data exchange between processes, fostering collaboration and coordination in various scenarios.

Creating a named pipe:

bash
$ mkfifo my_pipe

Writing data to the named pipe:

bash
$ echo "Hello, World!" > my_pipe

Reading data from the named pipe:

bash
$ cat my_pipe

Named pipes caveats

As handy as named pipes sound, there are a few intrinsic details you need to be aware of. These pose limitations for many use cases, so it's good to know them:

  • A pipe has a limited capacity (system-dependent, commonly 64 KiB on Linux), which means writes will block until the contents are read by another process
  • Writes larger than PIPE_BUF (4096 bytes on Linux) are not guaranteed to be atomic and may be interleaved with data from other writers
  • You cannot seek within pipe contents
  • When there are multiple reading processes, the content is not duplicated: each byte is delivered to exactly one reader (see the sketch after this list)
  • There are no ordering guarantees for multiple writing processes beyond the PIPE_BUF atomicity rule
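
The multiple-readers caveat can be observed with a small experiment. In this sketch, two cat processes read from the same named pipe (demo_pipe is an arbitrary name); each byte is consumed by exactly one of them:

bash
$ mkfifo demo_pipe
$ cat demo_pipe &        # reader 1
$ cat demo_pipe &        # reader 2
# the numbers are consumed by the readers without being duplicated;
# how they are split between the two `cat` processes is unpredictable
$ seq 1 1000 > demo_pipe
$ rm demo_pipe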

Meet Logdy

Logdy is a versatile DevOps tool designed to enhance productivity in the terminal. Operating under the UNIX philosophy, Logdy is a single-binary tool that requires no installations, deployments, or compilations. It works locally, ensuring security, and can be seamlessly integrated into the PATH alongside other familiar commands like grep, awk, sed, and jq. It is particularly beneficial for professionals such as software engineers, game developers, site reliability engineers, sys admins, and data scientists who frequently work with terminal logs or outputs.

Logdy records the output of processes, whether from standard output or a file, and directs it to a web UI. The web UI, served by Logdy on a specific port, is a reactive, low-latency application for browsing and searching through logs. It supports various use cases, such as tailing log files, integrating with applications (e.g., node.js, Python scripts, Go programs, or anything else that produces standard output), and tools like kubectl or docker logs.
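
For example, the same pipe pattern shown below applies to container and cluster logs as well (the pod and container names here are placeholders):

bash
# stream logs from a Kubernetes pod into Logdy
$ kubectl logs -f my-pod | logdy

# follow a Docker container's output the same way
$ docker logs -f my-container | logdy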

One notable feature is its hackability with TypeScript, allowing users to filter, parse, and transform log messages by writing TypeScript code directly within the browser. This hackability provides flexibility to express custom logic without delving into the intricacies of other command-line tools. Overall, Logdy offers a convenient and efficient solution for managing and analyzing terminal logs.

How does Logdy work with pipes?

Logdy is designed to consume all of a process's standard output. You can pair Logdy with any command that writes to either standard output or standard error.

bash
# use with any shell command
$ tail -f file.log | logdy
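
Note that a pipe only carries standard output. If you also want Logdy to capture standard error, merge it into standard output first using the shell's 2>&1 redirection (my_app is a placeholder for any program):

bash
# 2>&1 redirects standard error into standard output,
# so both streams flow through the pipe into logdy
$ my_app 2>&1 | logdy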