I have an important decision to make about moreutils, my collection of new unix tools. The question is whether to stand firm behind the idea of only adding tools to moreutils that are truly unique and not available in similar form in other packages, or whether to collect good tools that fill gaps in the standard unix toolset, even if they are similar to existing but possibly hard-to-find tools.

So far I've had an easy time deciding to reject some tools like add, todist and tostats, which are somewhat special purpose and which turn out to already be mostly implemented in the numutils package. And I'm also fairly comfortable with Lars's and my decision to not include mime, which is similar to File::MimeInfo's mimetype program, if only because implementing that requires a lot of code or a long dependency chain.

The decision is harder for things like shuffle/unsort, which both fill in the gap of a file randomisation tool. There already are packages in Debian (bogosort and randomise-lines) that provide this functionality. Another example is srename, which is similar to the rename program hidden in perl, which is insanely useful and rather underused.

It's also tricky since I could well be missing existing tools that overlap with moreutils. For example, I just learned of the renameutils package, which contains a qmv. That is close to the same thing as vidir, although limited to filename renames (no deletions), and rather more complex, with an interactive command line mode. I've decided that it's worth keeping vidir despite this, if only for the deletion support and the possible broader scope later, but it does point to a more general issue.

I concluded earlier that:

Maybe the problem isn't that no-one is writing them, or that the unix toolspace is covered except for specialised tools, but that the most basic tools fall through the cracks and are never noticed by people who could benefit from them.

One way that tools fall between the cracks, after all, is by being spread among lots of little obscure packages like File::MimeInfo, bogosort, randomise-lines, odd corners of perl, mmv, renameutils, etc. You might know about some of these packages, but you probably didn't know about all of them -- I know I didn't. I suspect that the authors of some of them didn't know about others and duplicated work.

A single package that collected good generic, consistent, and simple implementations of tools, even if some of them were not unique to it, could benefit from getting more attention than lots of small packages like those. It could both help users find out about these tools and focus development energy on making them better.

So which is more important, focusing on collecting and writing unique tools or promoting slightly less unique tools?
