While learning about and configuring weechat this evening, I noticed a lot of complexity and unsatisfying tradeoffs related to its UI, its mouse support, and its built-in window system. Got to wondering what I'd do differently, if I wrote my own IRC client, to avoid those problems.
The first thing I realized is, it is not a good idea to think about writing your own IRC client. Danger, Will Robinson..
So, let's generalize. This blog post is not about designing an IRC client, but about exploring simpler ways that something like an IRC client might handle its UI, and perhaps designing something general-purpose that could be used by someone else to build an IRC client, or be mashed up with an existing IRC client.
What any modern IRC client needs to do is display various channels to the user. Maybe more than one channel should be visible at a time in some kind of window, but often the user will have lots of available channels and only want to see a few of them at a time. So there needs to be an interface for picking which channel(s) to display, and if multiple windows are shown, for arranging the windows. Often that interface also indicates when there is activity on a channel. The most recent messages from the channel are displayed. There should be a way to scroll back to see messages that have already scrolled by. There needs to be an interface for sending a message to a channel. Finally, a list of users in the channel is often desired.
Modern IRC clients implement their own UI for channel display, windowing, channel selection, activity notification, scrollback, message entry, and user list. Even the IRC clients that run in a terminal include all of that. But how much of that do they need to implement, really?
Suppose the user has a tabbed window manager, that can display virtual terminals. The terminals can set their title, and can indicate when there is activity in the terminal. Then an IRC client could just open a bunch of terminals, one per channel. Let the window manager handle channel selection, windowing (naturally), and activity notification.
For scrollback, the IRC client can use the terminal's own scrollback buffer, so the terminal's regular scrollback interface can be used. This is slightly tricky; can't use the terminal's alternate display, and have to handle the UI for the message entry line at the bottom.
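One way the entry line could be handled, as a rough Python sketch: reserve the bottom row with a scroll region, and print each message on the last row of that region so older lines scroll up. The function names are made up for illustration, and the sketch assumes a terminal that moves lines scrolled off the top of the region into its scrollback buffer, which varies between terminals.

```python
def setup_screen(rows: int) -> str:
    """Escape sequence that reserves the last row for message entry."""
    # DECSTBM: restrict scrolling to rows 1..rows-1.
    return f"\x1b[1;{rows - 1}r"


def emit_message(rows: int, message: str) -> str:
    """Escape sequence that appends one message above the entry line."""
    return (
        "\x1b7"                  # DECSC: save the cursor (it sits in the entry line)
        f"\x1b[{rows - 1};1H"    # CUP: go to the last row of the scroll region
        "\n"                     # a line feed on that row scrolls the region up
        f"{message}"             # write the message on the freed row
        "\x1b8"                  # DECRC: put the cursor back in the entry line
    )
```

Writing `setup_screen(rows)` once and then `emit_message(rows, text)` per incoming message to the terminal should leave the entry line alone. Messages wider than the terminal would wrap and need extra care.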
That's all the UI an IRC client needs (except for the user list), and most of that is already implemented in the window manager and virtual terminal. So that's an elegant way to make an IRC client without building much new UI at all.
But, unfortunately, most of us don't use tabbed window managers (or tabbed terminals). Such an IRC client, in a non-tabbed window manager, would be a many-windowed mess. Even in a tabbed window manager, it might be annoying to have so many windows for one program.
So we need fewer windows. Let's have one channel list window, and one channel display window. There could also be a user list window. And there could be a way to open additional, dedicated display windows for channels, but that's optional. All of these windows can be separate virtual terminals.
A potential problem: When changing the displayed channel, it needs to output a significant number of messages for that channel, so that the scrollback buffer gets populated. With a large number of lines, that can be too slow to feel snappy. In some tests, scrolling 10 thousand lines was noticeably slow, but scrolling 1 thousand lines happens fast enough not to be noticeable.
(Terminals should really be faster at scrolling than this, but they're still writing scrollback to unlinked temp files.. sigh!)
An IRC client that uses multiple cooperating virtual terminals needs a way to start up a new virtual terminal displaying the current channel. It could run something like this:
x-terminal-emulator -e the-irc-client --display-current-channel
That would communicate with the main process via a unix socket to find out what to display.
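A minimal Python sketch of that exchange. The one-line "current-channel" request and both function names are invented here for illustration; nothing above specifies a protocol.

```python
import socket


def handle_request(request: bytes, state: dict) -> bytes:
    """Main-process side: answer one request line from a display process."""
    if request.strip() == b"current-channel":
        return state["current"].encode() + b"\n"
    return b"error unknown-request\n"


def ask_current_channel(conn: socket.socket) -> str:
    """Display-process side: ask the main process which channel to show."""
    conn.sendall(b"current-channel\n")
    reply = conn.makefile("rb").readline()
    return reply.decode().rstrip("\n")
```

In a real client the main process would listen on an AF_UNIX socket at some agreed path, and the display process would connect to it before calling `ask_current_channel`.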
Or, more generally:
x-terminal-emulator -e connect-pty /dev/pts/9
connect-pty would simply connect a pty device to the terminal, relaying IO between them. The calling program would allocate the pty and do IO to it. This may be too general to be optimal though. For one thing, I think that most curses libraries have a singleton terminal baked into them, so it might be hard to have a single process control cursors on multiple ptys. And, it might be inefficient to feed that 1 thousand lines of scrollback through the pty and copy it to the terminal.
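The relaying itself could be quite small. Here is a hedged Python sketch of the copy loop, assuming the pty has already been opened and is available as a file descriptor; `relay` is a hypothetical name, and a real connect-pty would also need to put the terminal into raw mode and forward window size changes.

```python
import os
import select


def relay(term_in: int, term_out: int, pty_fd: int) -> None:
    """Copy bytes between the terminal and the pty until either side hits EOF."""
    while True:
        readable, _, _ = select.select([term_in, pty_fd], [], [])
        for fd in readable:
            try:
                data = os.read(fd, 4096)
            except OSError:
                data = b""  # Linux reports EIO once the other end of a pty closes
            if not data:
                return
            # Terminal input goes to the pty; pty output goes to the terminal.
            os.write(term_out if fd == pty_fd else pty_fd, data)
```

Each byte is read and written once more than it would be with a direct connection, which is the copying overhead in question.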
Less general than that, but still general enough to not involve writing an IRC client, would be a library that handled the irc-client-like channel display, with messages scrolling up the terminal (and into the scrollback buffer), an input area at the bottom, and perhaps a few other fixed regions for status bars etc.
Oh, I already implemented that! In concurrent-output, over a year ago: a tiling region manager for the console.
I wonder what other terminal applications could be simplified/improved by using multiple terminals? One that comes to mind is mutt, which has a folder list, a folder view, and an email view, that all are shoehorned, with some complexity, into a single terminal.
i have a similar concern in a project i'm working on now, which is to run a bunch of tests in one window and show system logs in another window for convenience. i thought i would use tmux for this because the commandline interface is simple enough to be used like an API. my previous approach to this was to use a multi-interface design with simple functions like "alert", "prompt" or "log" that would send stuff to different places (terminal or gtk windows) depending on the environment, a bit like what debconf is doing, with all the crap that implies (generally: a suboptimal interface everywhere).
as for an irc client, you may be curious to try out ii: it's a simple FIFO and filesystem-based irc client that implements basically no user interface... a little nuts, but sounds like it would be the backend to the UI you are describing.
Firstly: yes, please :-) I've been using nmh for email lately and I love that I can open up a terminal and find something in my email archive for example, without disturbing the email I'm composing, or my inbox emptying process. Being able to connect multiple terminals is somewhat analogous to opening multiple tabs in the browser to the same webapp.
Second, odd that so many terminal emulators are so slow. I run st (suckless terminal) and it seems to be able to handle about 100,000 lines of output per second (my notes file a few times over, no escape codes but some unicode) on my 6 year old desktop.
Third, most tiling window managers wreak havoc on existing output in terminals, even if they only resize the vertical dimension of the terminals. Tabbed should work fine though.
I like the idea of unifying terminals, editors, IRC clients, mail clients, and other things that need multiple windows (whether tiled, tabbed, or just switched between). I don't know that I'd want to use terminal scrollback for that (I prefer editor-style or screen-style buffers with scroll positions), but many aspects of all those programs overlap. I can see why people use Emacs for everything, and I hope to see vim/neovim go in the same direction now that both have introduced coprocess support.
Editor buffers work nicely as a generalization, but they don't fit perfectly. For instance, embedding a shell in an editor needs to disable some of the editing functions for part but not all of the buffer. You want to use the editor to edit the next command line, but treat the prompt before it as read-only. (But you can use editor functions to search and copy text.)
I'd love to have a "libtmux" or "libscreen" that applications can embed. But I'd also like those applications to cooperate within a single thing that behaves like screen/tmux/vim, not to each have their own. What would such an abstraction look like?
What about a multi-pty-management library that can delegate to an outer multi-pty-management application? Imagine a library that lets you manage a pile of independent terminal-like buffers, and manage them like screen/tmux within a normal containing terminal. However, that library can also recognize when run inside an application that can handle such buffers itself (such as via an environment variable providing a communication socket). In that case, it'll contact that application, and delegate pty management to that application. The application can then logically group the buffers from each application together, and provide various functions for navigation, tiling, tabbing, switching, scrolling, searching, etc.
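The detection step might look something like this Python sketch. The variable name `PTY_MANAGER_SOCKET` and the function `pick_backend` are assumptions made up for illustration, not an existing convention.

```python
import os

MANAGER_SOCKET_VAR = "PTY_MANAGER_SOCKET"  # hypothetical variable name


def pick_backend(environ=os.environ):
    """Decide whether to delegate to an outer manager or manage buffers ourselves."""
    path = environ.get(MANAGER_SOCKET_VAR)
    if path:
        # An outer application advertised a socket: hand pty management to it.
        return ("delegate", path)
    # Nothing advertised: fall back to screen/tmux-style handling in this terminal.
    return ("embedded", None)
```

An application would call this once at startup and then route every "open a new buffer" request through whichever backend was chosen.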
Trying st with my ad-hoc benchmark, it managed to scroll around 4 thousand lines without noticeable flicker. But I don't think st has a scroll buffer. Running screen in it to buffer the scrollback slowed it down to around 1-2 thousand lines, so comparable to vte.
@josh interesting thoughts..
A pty-management library does seem like a good idea. There could be different pty-manager commands that open xterms, open screen buffers, etc. So the user runs something like:
And irc-client could also be run in any shell that was started inside a pty-manager, talking to the outer pty-manager.
When the irc-client opens a pty, in-xterms would need to use something like xterm -e connect-pty to open a new xterm connected back to that pty. But when the irc-client runs the mail-client, in-xterms can just run xterm -e mail-client and doesn't need to relay.
(Leaving aside for the moment the problem that xterm -e does not propagate the exit status of the program it runs to its caller..)
While connect-pty would have some copying and syscall overhead (probably similar to screen), if a program found that excessive, it could run a child process, which would be connected up directly to the terminal (in the in-xterms case). That also provides a workaround for the singleton terminal in curses libraries.
As well as an environment variable containing the socket of the pty-manager, there could be one containing the address of the pty. If an irc-client opens a new pty to run a mail client, the mail client would see an address of eg "irc-client.mail-client", and it can use that address when it opens a new pty for an editor, so the editor gets an address of eg "irc-client.mail-client.editor". The pty manager can see a whole tree:
Such human-friendly names in a tree would be useful for interactively managing ptys and could also inform tiling layouts. It also lets the pty manager set the default window title/screen buffer name to something reasonable.
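To make the addressing concrete, here is a small Python sketch of how a child could derive its address and how a manager could fold the addresses it has seen into a tree. `PTY_ADDRESS` is a made-up variable name, and both helpers are hypothetical.

```python
def child_address(environ: dict, name: str) -> str:
    """Prefix the parent's address (if any) to the name of a newly opened pty."""
    parent = environ.get("PTY_ADDRESS")  # hypothetical variable name
    return f"{parent}.{name}" if parent else name


def build_tree(addresses) -> dict:
    """Fold dotted addresses into nested dicts, one level per component."""
    tree: dict = {}
    for address in addresses:
        node = tree
        for part in address.split("."):
            node = node.setdefault(part, {})
    return tree
```

For example, `child_address({"PTY_ADDRESS": "irc-client"}, "mail-client")` gives "irc-client.mail-client". Splitting on "." is a simplification: a component that itself contains a dot would need some escaping rule.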
The C API could look like this:
Compare with openpty(3) and forkpty(3).
Here addr is the address of the pty, eg "irc-client.channel.#haskell", and any parent address is automatically prefixed to it.
The implementation of these would use the socket environment variable to send requests to the pty manager. Mm, fd passing..
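The fd passing part can be sketched in Python (3.9 or later, for socket.send_fds and socket.recv_fds). This is only an illustration under assumptions: the manager is assumed to keep the master side, as a terminal emulator would, and hand the slave side to the application, and both function names are invented.

```python
import os
import socket


def manager_open_pty(conn: socket.socket, addr: str) -> int:
    """Manager side: allocate a pty, send the slave fd to the client, keep the master."""
    master, slave = os.openpty()
    # The address travels as ordinary data; the fd rides along as SCM_RIGHTS.
    socket.send_fds(conn, [addr.encode()], [slave])
    os.close(slave)  # the client now holds its own copy of the slave
    return master


def client_receive_pty(conn: socket.socket):
    """Client side: receive the address and the slave fd from the manager."""
    data, fds, _flags, _peer = socket.recv_fds(conn, 1024, 1)
    return data.decode(), fds[0]
```

The connection has to be a Unix domain socket for the descriptor to be passed. Whatever the client writes to the received fd then shows up on the manager's master side.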
The winsize parameter lets a preferred window size be selected (to open an xterm with that size, or tile a buffer at that size).
It would probably be a good idea to add a parameter to both API calls for additional hints. For example, a pty manager might be able to lay out one buffer to the left of another one, if provided a hint to do so. Or a pty might be non-interactive, and so a pty manager should not let the focus move to it. This needs to be something that's extensible. Could be as simple as char *const managehints[]