Unix text tools
There are a few standard text processing tools that are used very often on Unix-like systems.
- No regular expression is used:
  - cat(1) concatenates files and outputs the whole content.
  - tac(1) concatenates files and outputs in reverse.
  - cut(1) selects parts of lines and outputs.
  - head(1) outputs the first part of files.
  - tail(1) outputs the last part of files.
  - sort(1) sorts lines of text files.
  - uniq(1) removes duplicate lines from a sorted file.
  - tr(1) translates or deletes characters.
  - diff(1) compares files line by line.
- Basic regular expression (BRE) is used:
  - grep(1) matches text with patterns.
  - ed(1) is a primitive line editor.
  - sed(1) is a stream editor.
  - vim(1) is a screen editor.
  - emacs(1) is a screen editor. (somewhat extended BRE)
- Extended regular expression (ERE) is used:
  - egrep(1) matches text with patterns.
  - awk(1) does simple text processing.
  - tcl(3tcl) can do every conceivable text processing. See re_syntax(3). Often used with tk(3tk).
  - perl(1) can do every conceivable text processing. See perlre(1).
  - pcregrep(1) from the pcregrep package matches text with Perl Compatible Regular Expressions (PCRE) patterns.
  - python(1) with the re module can do every conceivable text processing. See "/usr/share/doc/python/html/index.html".
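As a rough illustration of the three groups above, the following sketch runs one tool chain from each group on a small sample file. The file contents and variable names are made up for this example, and `grep -E` is used as the equivalent of egrep(1).

```shell
# Build a small sample file (hypothetical data, for illustration only).
sample=$(mktemp)
printf '%s\n' 'apple:3' 'banana:1' 'apple:3' 'cherry:2' > "$sample"

# No regular expression: take the first colon-separated field,
# sort it, and drop adjacent duplicate lines.
fruits=$(cut -d: -f1 "$sample" | sort | uniq)

# BRE: in grep's default syntax the interval operator needs backslashes,
# so p\{2\} means "two consecutive p characters".
bre_hits=$(grep 'p\{2\}' "$sample" | sort -u)

# ERE: grep -E (equivalent to egrep) allows grouping and alternation
# without backslashes.
ere_hits=$(grep -E '^(banana|cherry):' "$sample")

printf '%s\n' "$fruits" "$bre_hits" "$ere_hits"
rm -f "$sample"
```

The same BRE pattern written as an ERE would be `p{2}`, which is the main practical difference between the two syntaxes for everyday use.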
Simply using script(1) (see Section 1.4.9, “Recording the shell activities”) to record shell activity produces a file with control characters. This can be avoided by using col(1) as follows.

Recording the shell activities cleanly
$ script
Script started, file is typescript

Do whatever … and press Ctrl-D to exit script.

$ col -bx <typescript >cleanedfile
$ vim cleanedfile
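To see what `col -bx` does to this kind of file without running script(1) interactively, the sketch below feeds it a hand-made line containing backspaces. The sample bytes are an assumption chosen for illustration, not real typescript output, and the check for col is there only because it may not be installed on a minimal system.

```shell
# Simulate overstruck output: "abc", two backspaces, then "XY"
# written over the "b" and "c" positions.
raw=$(mktemp)
printf 'abc\b\bXY\n' > "$raw"

if command -v col >/dev/null 2>&1; then
    # -b: do not output backspaces; keep only the last character
    #     written to each column position
    # -x: output spaces instead of tabs
    cleaned=$(col -bx < "$raw")
else
    cleaned='col not installed'
fi

printf '%s\n' "$cleaned"
rm -f "$raw"
```

When col is available, the backspaces disappear and only the final character in each column remains.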
If you don't have script (for example, during the boot process in the initramfs), you can use the following instead.
$ sh -i 2>&1 | tee typescript
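The redirection in the command above can be tried non-interactively. This is a minimal sketch that replaces the interactive `sh -i` with a one-shot `sh -c` so it can run in a script. The log file is a temporary file rather than `typescript`, and the echoed strings are placeholders.

```shell
log=$(mktemp)

# 2>&1 merges the child shell's stderr into its stdout, so tee receives
# both streams through the pipe and writes them to the log file.
sh -c 'echo "to stdout"; echo "to stderr" >&2' 2>&1 | tee "$log" >/dev/null

captured=$(cat "$log")
printf '%s\n' "$captured"
rm -f "$log"
```

Without `2>&1`, error messages would bypass the pipe and never reach the log, which is why the redirection matters when recording a session this way.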