translate - translate words or fragments
doc generated from the script with gendoc
ruby script, version=1.23


translate [options] [filename]


print this help and exit
print full documentation and exit
print version and exit
-c,--comma[=list separator]
string separating the regexps/replacements
in the --old, --new and --pairs options; comma initially, empty by default
file with tab-separated regexps/words
edit files in place
if STRING is given, save backups with STRING appended; STRING must start with . or ~
take regexps as literal strings
allow multiline expressions
comma-separated replacements for the regexps
comma-separated regexps to be translated
comma-separated pairs of regexps and their replacements
separator between regexps/words in --from file
tab initially and by default
do a test run on the script's DATA section
run verbosely
only translate if matching string occurs
between word boundaries


In ASCII or UTF-8 encoded texts, translate converts words or text fragments into other words or text fragments.

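The words/fragments to be translated are given as an array of regular expression / translation (regexp,translation) pairs, and there are three ways to provide these pairs: with the --old and --new options, with the --pairs option, or in a file named by the --from option.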

These three alternatives are evaluated in the given order. More details about these options are given in the Options section.
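
The pair model described above can be sketched in a few lines of Ruby. This is an illustration only, not the script's own code: the helper name apply_pairs and the sample pairs are assumptions made for the example.

```ruby
# Hypothetical helper: apply [regexp, translation] pairs one after another,
# each pair working on the result of the previous one.
def apply_pairs(text, pairs)
  pairs.reduce(text) do |result, (regexp, translation)|
    result.gsub(regexp, translation)
  end
end

sample_pairs = [
  [/colou?r/, "hue"],
  [/gr[ae]y/, "silver"]
]

puts apply_pairs("a grey colour and a gray color", sample_pairs)
# prints: a silver hue and a silver hue
```

Because the pairs are applied in sequence, the output of an earlier pair can be matched again by a later one, which is why the order of the pairs matters.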


Prints help information, then quits.
Prints the full documentation, then quits.
Prints the version, then quits.
If you need to translate from or to strings that contain commas, --comma makes translate split the lists on the given string instead of on commas. If you set it to an empty string, or use it without an argument, no splitting is performed and the whole --old or --new string is translated from or to. If you then use the --pairs option, its argument is split into characters, so that single characters can be converted into another set of single characters.
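
The splitting rule for --comma might look roughly like this in Ruby. The helper split_list is hypothetical and only mirrors the behaviour described above for --old and --new; how the script itself splits its arguments is not shown in this documentation.

```ruby
# Hypothetical helper: split an option value on a separator.
# An empty or missing separator means "do not split at all".
def split_list(value, separator)
  return [value] if separator.nil? || separator.empty?

  value.split(separator, -1) # -1 keeps trailing empty fields
end

p split_list("John,Pete,Mary", ",") # prints: ["John", "Pete", "Mary"]
p split_list("John, Bill", "")      # prints: ["John, Bill"]
p split_list("a;b,c", ";")          # prints: ["a", "b,c"]
```
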
With --inplace (-i), files to be translated are edited in place. If a string is given, a copy of the original file is saved under the same name with the string appended; the string must start with ~ or a period.
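
As a rough sketch of in-place editing with an optional backup, the following Ruby fragment copies the file before overwriting it. The helper edit_in_place and the file names are assumptions for illustration; the script's actual file handling may differ.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: rewrite a file in place, optionally keeping a backup
# whose name is the original name plus the given suffix.
def edit_in_place(path, backup_suffix = nil)
  if backup_suffix
    unless backup_suffix.start_with?(".", "~")
      raise ArgumentError, "backup suffix must start with . or ~"
    end
    FileUtils.cp(path, path + backup_suffix)
  end
  File.write(path, yield(File.read(path)))
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "file.tex")
  File.write(path, "old text\n")
  edit_in_place(path, "~") { |text| text.gsub("old", "new") }
  puts File.read(path)       # prints: new text
  puts File.read(path + "~") # prints: old text
end
```
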
--from specifies a file containing tab-separated pairs of a word/fragment and its translation, one pair per line. With the --tab option the tab separator can be changed to another character or string. Translations given on the command line are performed before those given in a file.
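
Reading such a pair file could be sketched as follows. The helper parse_pairs is hypothetical; skipping blank lines and treating a missing translation as the empty string are assumptions of this sketch, not documented behaviour.

```ruby
# Hypothetical helper: turn the contents of a pair file into
# [regexp, translation] pairs, one pair per non-empty line.
def parse_pairs(content, separator = "\t")
  content.each_line.map(&:chomp).reject(&:empty?).map do |line|
    regexp, translation = line.split(separator, 2)
    [Regexp.new(regexp), translation || ""] # assumption: no translation means ""
  end
end

content = "colou?r\thue\ngr[ae]y\tsilver\n"
parse_pairs(content).each do |regexp, translation|
  puts "#{regexp.source} -> #{translation}"
end
# prints:
# colou?r -> hue
# gr[ae]y -> silver
```
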
With --literal, the regexps are interpreted as plain strings. Without it, if you use the option --pairs 'a+','b+', every occurrence of one or more a's will be translated into b+. The --literal option prevents this: it causes special characters in regexps to be escaped.
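
The difference between the two interpretations can be reproduced in Ruby with Regexp.escape. The helper names below are made up for this example, and the sketch only assumes that --literal behaves like escaping the pattern before compiling it.

```ruby
# Hypothetical helpers: treat the pattern as a regexp, or as a literal string.
def translate_regexp(text, pattern, replacement)
  text.gsub(Regexp.new(pattern)) { replacement }
end

def translate_literal(text, pattern, replacement)
  text.gsub(Regexp.new(Regexp.escape(pattern))) { replacement }
end

text = "caaat a+"
puts translate_regexp(text, "a+", "b+")  # prints: cb+t b++
puts translate_literal(text, "a+", "b+") # prints: caaat b+
```
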
The expression may contain newline characters. This implies that the file is not handled line by line but as a whole, which can cause memory allocation problems for extremely large files. With the --literal option, the sequence \n is still seen as a newline; it will not be escaped.
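
Why a multiline expression requires reading the whole file can be seen in a small Ruby sketch. Both helper names are hypothetical; the sketch only contrasts line-by-line substitution with substitution on the complete text.

```ruby
# Hypothetical helpers: substitute per line, or on the text as a whole.
def translate_by_line(text, regexp, replacement)
  text.each_line.map { |line| line.gsub(regexp, replacement) }.join
end

def translate_whole(text, regexp, replacement)
  text.gsub(regexp, replacement)
end

text = "first\nsecond\nthird\n"
regexp = /first\nsecond/

p translate_by_line(text, regexp, "merged") # prints: "first\nsecond\nthird\n"
p translate_whole(text, regexp, "merged")   # prints: "merged\nthird\n"
```

The line-by-line version never sees two lines at once, so a pattern that spans a newline cannot match there.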
Comma-separated list of translations (--new) for the words/fragments given with the --old option. If this list is shorter than the --old list, missing translations are set to the empty string, effectively deleting the corresponding words/fragments. Strings to be deleted must then come at the end of the --old list, of course. Alternatively, you can use empty strings explicitly; for example, to remove John and Mary and translate Pete to Peter:
   translate --old=John,Pete,Mary --new=,Peter,
or, using the --pairs option:
   translate --pairs=John,,Pete,Peter,Mary,
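
The padding rule for a short --new list can be sketched in Ruby as below. The helpers build_pairs and translate_text are hypothetical, and the sketch assumes a plain comma separator.

```ruby
# Hypothetical helper: pair each --old entry with its --new entry,
# using the empty string where the --new list has run out.
def build_pairs(old_list, new_list, separator = ",")
  olds = old_list.split(separator, -1)
  news = new_list.split(separator, -1)
  olds.each_with_index.map { |old, i| [Regexp.new(old), news[i] || ""] }
end

def translate_text(text, pairs)
  pairs.reduce(text) { |result, (regexp, new)| result.gsub(regexp, new) }
end

text = "John met Pete and Mary"
# Explicit empty strings, as in --old=John,Pete,Mary --new=,Peter,
p translate_text(text, build_pairs("John,Pete,Mary", ",Peter,"))
# prints: " met Peter and "
# Shorter --new list, with the strings to delete at the end of --old
p translate_text(text, build_pairs("Pete,John,Mary", "Peter"))
# prints: " met Peter and "
```
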
Comma-separated list of strings (--old), which are interpreted as Ruby regexps to be replaced. When the --literal option is used, the strings are taken literally, not as regexps.
Comma-separated list of regexp,translation pairs (--pairs): a merge of the --old and --new options.
Separator string (--tab) between the two members of each pair in the word-pair file. The default is the tab character.
Runs a test by interpreting the DATA section of the translate script. This may be useful to see further examples.
Prints debugging information.
With --word, translated texts must occur between word boundaries. Word boundaries are characters matching [a-zA-Z_] plus line boundaries; thus the command translate --old=test --word will delete every word "test" but will leave a word like "testing" untouched.
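
A whole-word match in the spirit of --word can be sketched with lookarounds. This assumes that the class [a-zA-Z_] in the description names the characters that make up a word; the helper word_regexp is hypothetical. Ruby's own \b is not used here because it also counts digits as word characters.

```ruby
# Hypothetical helper: wrap a regexp source so that it only matches when it
# is not directly preceded or followed by a word character.
def word_regexp(source)
  word_char = "[a-zA-Z_]" # assumption: taken from the description of --word
  Regexp.new("(?<!#{word_char})(?:#{source})(?!#{word_char})")
end

p "a test of testing, test".gsub(word_regexp("test"), "")
# prints: "a  of testing, "
```
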


In file.tex, convert \chapter to \section and \section to \subsection, saving the original in file.tex~:

   translate --old='\section,\chapter' \
             --new='\subsection,\section' \
             --literal -i~ file.tex

Note the order in which the words are given; reversing the order would turn both \chapter and \section into \subsection. The --literal option was used here, because in a regexp, \s would be interpreted as a whitespace character, which is clearly not what we want here. As an alternative, the same result can be produced without the --literal option by escaping the regexp:

   translate --old='\\section,\\chapter' \
             --new='\\subsection,\\section' \
             --inplace=.bak file.tex

Exchanging the occurrences of John and Bill is a little tricky. The following would change both to John:

   translate -o John,Bill -n Bill,John testfile

We need an extra pair here:

   translate -o John,Bill,SomeWeirdString -n SomeWeirdString,John,Bill
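
The effect of the extra pair can be traced in a small Ruby sketch of sequential substitution. The helper translate_text is hypothetical and treats the names as literal strings.

```ruby
# Hypothetical helper: replace each old string by its new string, in order.
def translate_text(text, olds, news)
  olds.zip(news).reduce(text) { |result, (old, new)| result.gsub(old, new) }
end

text = "John phoned Bill"

# Two pairs: John becomes Bill, then every Bill becomes John.
puts translate_text(text, %w[John Bill], %w[Bill John])
# prints: John phoned John

# Three pairs: John is parked in a placeholder until Bill has been handled.
puts translate_text(text, %w[John Bill SomeWeirdString], %w[SomeWeirdString John Bill])
# prints: Bill phoned John
```

The placeholder only works if it does not already occur in the text, hence a deliberately unusual string.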

In standard input, change every occurrence of "John, Bill" into "Anny" and write the result to standard output; the comma as a list separator must be changed, or eliminated:

   translate --comma --old='John, Bill' --new=Anny

Do the same on many files, in place, with no backup:

   translate -o 'John, Bill' -n Anny -ci *.txt

Replace any two consecutive lines in file file.html containing:


with six lines containing:

 <!--#include virtual="" --> 
 <!--#config timefmt="%Y-%m-%d %X %Z" --> 
 <!--#echo var="LAST_MODIFIED" --> 
 <!--#include virtual="" --> 
 <!-- vim: tw=0 
 translate -lmi -o '
 ' -n '
 <!--#include virtual="" --> 
 <!--#config timefmt="%Y-%m-%d %X %Z" --> 
 <!--#echo var="LAST_MODIFIED" --> 
 <!--#include virtual="" --> 
 <!-- vim: tw=0 
 ' file.html


Wybo Dekker


Released under the GNU General Public License