DataConv: Data Conversion and Migration Tools    

Conversion Tools: Unicode


AutoUniConv is an automatic Unicode converter. You do not have to know the input's charset: AutoUniConv identifies the charset automatically and then converts the input to Unicode. The most common Unicode Transformation Format schemes (UTF-8, UTF-16, UTF-32) are supported. AutoUniConv is a C/C++ library with an easy-to-use interface and no additional software dependencies.

license: commercial
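
The "identify first, convert afterwards" idea can be sketched in Python. This is not AutoUniConv's algorithm, which is not described here; it is a much simpler stand-in that checks for a byte order mark and then tries a list of candidate encodings. The function name `guess_and_decode` and the candidate list are assumptions for illustration.

```python
import codecs

# UTF-32 BOMs come first: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_and_decode(data: bytes, candidates=("utf-8", "cp1252", "iso-8859-1")):
    """Return (encoding, text). Hypothetical helper, not a library API."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            # These codecs consume the BOM and pick the byte order themselves.
            return encoding, data.decode(encoding)
    for encoding in candidates:
        try:
            return encoding, data.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding matched")


def to_utf(data: bytes, target: str = "utf-8") -> bytes:
    """Decode with the guessed charset, then encode as UTF-8/16/32."""
    _, text = guess_and_decode(data)
    return text.encode(target)
```

Trying encodings in a fixed order only works because strict decoders such as UTF-8 reject most foreign input; real detectors also use statistical evidence.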

convmv - filename encoding conversion tool

convmv can convert a single filename, a directory tree or all files on a filesystem to a different encoding. It only converts the encoding of filenames, not file contents. A special feature of convmv is that it also takes care of symlinks: the encoding of the symlink's target will be converted if the symlink itself is being converted. Directories that are already partially UTF-8 encoded can also be converted to UTF-8.
license: GPL
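
The core step, re-encoding a name without reading the file, can be sketched in Python. This is not how convmv itself is implemented; `convert_name`, `rename_in_dir` and the ISO-8859-1 default are assumptions for illustration, and symlink targets are not handled.

```python
import os


def convert_name(raw_name: bytes, src: str = "iso-8859-1", dst: str = "utf-8") -> bytes:
    """Re-encode a filename given as raw bytes; file contents are never read."""
    try:
        # A name that already decodes in the target encoding is left alone,
        # so a partially converted tree is not converted twice.
        raw_name.decode(dst)
        return raw_name
    except UnicodeDecodeError:
        return raw_name.decode(src).encode(dst)


def rename_in_dir(directory: bytes, dry_run: bool = True) -> list:
    """List (and, if dry_run is False, apply) the renames needed in one directory."""
    planned = []
    for old in os.listdir(directory):  # a bytes path yields bytes names
        new = convert_name(old)
        if new != old:
            planned.append((old, new))
            if not dry_run:
                os.rename(os.path.join(directory, old), os.path.join(directory, new))
    return planned
```

The "already valid in the target encoding" test is a heuristic: a few legacy byte sequences happen to be valid UTF-8 as well, so a dry run first is advisable.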


uni2ascii converts UTF-8 Unicode to any of a variety of 7-bit ASCII equivalents: hexadecimal and decimal HTML numeric character entities, \u-escapes, standard hexadecimal, and raw hexadecimal. Such ASCII equivalents are useful when including Unicode text in program source, when entering text into Web programs that can handle the Unicode character set but are not 8-bit safe, and when debugging.
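
Three of these output styles can be sketched in Python. The exact formats uni2ascii emits are not specified here, so the function names and details such as uppercase hex digits and `\U` plus eight digits for characters beyond U+FFFF are assumptions.

```python
def html_hex(text: str) -> str:
    """Hexadecimal HTML numeric character references for non-ASCII characters."""
    return "".join(c if ord(c) < 0x80 else f"&#x{ord(c):X};" for c in text)


def html_dec(text: str) -> str:
    """Decimal HTML numeric character references for non-ASCII characters."""
    return "".join(c if ord(c) < 0x80 else f"&#{ord(c)};" for c in text)


def u_escape(text: str) -> str:
    """Backslash-u escapes; ASCII characters pass through unchanged."""
    out = []
    for c in text:
        cp = ord(c)
        if cp < 0x80:
            out.append(c)
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")
        else:
            out.append(f"\\U{cp:08X}")
    return "".join(out)
```

Input read from a UTF-8 file would first be decoded with `data.decode("utf-8")`; the result of each function is pure 7-bit ASCII.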
license: GNU General Public License (GPL)

ASCII to UNICODE (UTF-8)

See Linux-Magazin 9/2000 p. 136ff. (in German).

#!/usr/bin/perl -wpi

# note: the C and U modifiers of tr/// exist only in Perl 5.6
no warnings utf8;	# warnings off, bug
tr/\x80-\xff//CU;	# conversion to UTF-8

UNICODE to ASCII (UTF-8)

from Linux-Magazin 9/2000 p. 136ff.

#!/usr/bin/perl -wpi

tr/\0-\xff//UC;		# conversion from UTF-8 to Latin-1 (8859-1)
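
The C and U modifiers of tr/// used in the two scripts above were an experimental Perl 5.6 feature and were removed in later Perl releases. As a sketch of the same two conversions that does not depend on them, here is a Python equivalent; the function names are illustrative.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Latin-1 (ISO-8859-1) bytes to UTF-8 bytes; every input byte is valid."""
    return data.decode("iso-8859-1").encode("utf-8")


def utf8_to_latin1(data: bytes) -> bytes:
    """UTF-8 bytes to Latin-1 bytes.

    Raises UnicodeEncodeError for characters above U+00FF, which Latin-1
    cannot represent, and UnicodeDecodeError for invalid UTF-8 input.
    """
    return data.decode("utf-8").encode("iso-8859-1")
```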

Chinese Big5 -> Unicode for WAP

Chinese Big5 -> Unicode online tool. It is a Perl program; online conversion from GB (Simplified Chinese) and JP to Unicode is planned for later.
platform: Web-based
license: free to use online

GNU Recode


2utf Translates various charsets to UTF-8
platform: Linux


utf2any translates a file encoded in UTF-7 or UTF-8 (Unicode) into any 7- or 8-bit text format. Currently, mapping tables are supplied for LaTeX, HTML, ISO-8859-1 and ISO-8859-15. These tables don't provide a complete mapping, but they can easily be extended to suit personal needs. tex-archive/support
platform: Linux, Unix, MS-DOS
license: GPL
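
A table-driven conversion of this kind can be sketched in Python. The table below is a tiny hand-written sample, not one of the tables shipped with utf2any, and `utf8_to_latex` is a hypothetical name.

```python
# Sample mapping from Unicode characters to LaTeX markup (deliberately incomplete).
LATEX_TABLE = {
    "\u00e4": '\\"a',    # a with diaeresis
    "\u00f6": '\\"o',    # o with diaeresis
    "\u00fc": '\\"u',    # u with diaeresis
    "\u00df": "\\ss{}",  # sharp s
    "\u00e9": "\\'e",    # e with acute
}


def utf8_to_latex(data: bytes, table=LATEX_TABLE) -> str:
    """Replace each non-ASCII character by its table entry; ASCII passes through."""
    out = []
    for ch in data.decode("utf-8"):
        if ord(ch) < 0x80:
            out.append(ch)
        elif ch in table:
            out.append(table[ch])
        else:
            raise ValueError(f"no mapping for U+{ord(ch):04X}")
    return "".join(out)
```

Extending the mapping means adding entries to the dictionary, or passing a merged one as `table`.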

Unicode to UTF-8 Converter

The Unicode to UTF-8 Converter takes Unicode values (in hexadecimal) and encodes them as UTF-8, optionally displaying the resulting character and/or the Unicode description thereof.
license: GNU General Public License (GPL)
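
What such a converter computes can be sketched with the Python standard library. The function name `describe` and the output layout are assumptions, not the tool's actual interface.

```python
import unicodedata


def describe(hex_value: str):
    """Return (UTF-8 bytes as hex, character, Unicode name) for a hex code point."""
    code_point = int(hex_value, 16)
    char = chr(code_point)        # ValueError above U+10FFFF
    utf8 = char.encode("utf-8")   # UnicodeEncodeError for surrogates
    hex_bytes = " ".join(f"{b:02X}" for b in utf8)
    name = unicodedata.name(char, "<no name>")
    return hex_bytes, char, name
```

For example, `describe("20AC")` yields the three bytes `E2 82 AC` and the name `EURO SIGN`.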


hutrans converts plain text into UTF-8 Unicode encoding. The original file should contain HTML-style tags for any non-ASCII character. This program complements the functionality of uhtrans and should typically be used along with it.
platform: Linux
license: BSD type
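
The exact tag syntax hutrans expects is not given here. Assuming the tags are standard HTML character references such as `&eacute;` or `&#8364;`, the conversion can be sketched in Python; `html_tags_to_utf8` is a hypothetical name.

```python
import html


def html_tags_to_utf8(ascii_text: str) -> bytes:
    """Expand HTML character references and encode the result as UTF-8."""
    return html.unescape(ascii_text).encode("utf-8")
```

Note that `html.unescape` also expands `&amp;`, `&lt;` and similar references, which may or may not match what hutrans does.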


ptrans converts UTF-8 Unicode files into plain text. Along with other programs found on the same web page, it completes a suite of i18n tools allowing you to convert text files from any character encoding to any other character encoding, and to and from UTF-8 Unicode encoding.
platform: Linux
license: BSD type

Letter Database: Online Conversion of Languages, Character Sets, Names

Letter Database offers Online Conversion of Languages, Character Sets, Names, etc.

Convert Character Set

Convert Character Set is meant to convert text strings between different character set encodings. It features conversion between single byte character sets, from single byte to multi-byte character sets (UTF-8), and from multi-byte to single byte. All conversion output can be saved with numeric entities (browser character set independent). The main requirement is that a character has to be in both character sets, or it will return an error.
license: Freeware
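
The behaviour described above (strict conversion, with optional numeric-entity output) can be sketched in Python. This is an illustration, not the tool's code; `convert` and its `as_entities` parameter are assumptions.

```python
def convert(data: bytes, src: str, dst: str, as_entities: bool = False) -> bytes:
    """Convert bytes from charset src to charset dst.

    Raises UnicodeDecodeError if data is invalid in src and
    UnicodeEncodeError if a character does not exist in dst.
    """
    text = data.decode(src)
    encoded = text.encode(dst)  # strict: every character must exist in dst
    if as_entities:
        # Charset-independent output: ASCII plus numeric character references.
        return text.encode("ascii", errors="xmlcharrefreplace")
    return encoded
```

Converting the ISO-8859-15 euro sign (byte 0xA4) to ISO-8859-1 fails, because ISO-8859-1 has no euro sign; converting it to UTF-8 succeeds.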


last change Fri Oct 7 2011 :: copyright © Werner Heuser 1999-2017 ::