Conversion Tools: Databases From Text Files
The SANKHYA Translation Framework (STF)
is a novel framework for building
dynamic model driven parsers and data integration systems. STF can be used
to automate EAI activities like document and message processing, protocol
conversion, text to XML transformations, SQL database to XML
transformations, C++ and Java code generation, server page processing,
data conversion and adapter development.
STF uses STML, a powerful language for modeling language grammars or
schema of documents, and for specifying translation and transformation
rules in a unified manner.
STML Translator (ST) is a model-driven translation and transformation tool
(executable). ST can parse a document, verify that the document conforms
to the model, and automatically translate or transform the document to a
different representation (for example, text to XML) using the rules specified
in an STML model file.
platform: Linux, MS-Windows
license: Commercial
Import Wizard
can import printer spool files, delimited and fixed width
files, as well as HTML tables into Access, Excel, MS-SQL Server or ODBC
databases. The software goes beyond the standard text import functionality
by allowing imports with multiple lines per record and importing headers and
footers. The software is useful for importing information from any
computer system, regardless of whether the system is a legacy mainframe, DOS,
Unix or Windows system.
In addition to importing data, Import Wizard lets you manipulate data prior
to importing it into the database. You're usually forced to import everything
into a table and then clean it up, but Import Wizard contains a number of
features that will get your data in shape faster. You can identify data
that's separated by a delimiter (such as lastname, firstname or city, state)
and split the data into separate fields. You can adjust the cutoff year for
20th and 21st century dates and import nonstandard formatted dates, which
you ordinarily might have to import as text and go through a conversion step
after the fact. Another feature is the ability to create fields that store
formula results based on source data. For example, you could create an IIf()
formula that conditionally fills in a code based on existing data. You can
also use wildcard characters to select and import multiple files. In
addition, you can apply filters to restrict what's imported.
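
The kind of pre-import cleanup described above can be sketched in a few lines of Python. This is not Import Wizard's own code or formula syntax; the function names, the MM.DD.YY date format, the cutoff value of 30 and the region rule are all assumptions chosen for illustration.

```python
from datetime import date


def split_field(value, delimiter=","):
    """Split one delimited value into separate, whitespace-trimmed fields."""
    return [part.strip() for part in value.split(delimiter)]


def expand_year(two_digit_year, cutoff=30):
    """Expand a two-digit year: below the cutoff means 20xx, otherwise 19xx."""
    century = 2000 if two_digit_year < cutoff else 1900
    return century + two_digit_year


def parse_date(text, cutoff=30):
    """Parse a nonstandard MM.DD.YY date string (format assumed for this sketch)."""
    month, day, year = (int(part) for part in text.split("."))
    return date(expand_year(year, cutoff), month, day)


def clean_record(raw, cutoff=30):
    """Turn one raw source row into the fields that would be imported."""
    last_name, first_name = split_field(raw["name"])
    city, state = split_field(raw["location"])
    return {
        "last_name": last_name,
        "first_name": first_name,
        "city": city,
        "state": state,
        "hired": parse_date(raw["hired"], cutoff),
        # Computed field, comparable to an IIf() formula on existing data.
        "region": "WEST" if state in {"CA", "OR", "WA"} else "OTHER",
    }
```

Calling `clean_record({"name": "Smith, Jane", "location": "Portland, OR", "hired": "03.15.98"})` yields separate name and location fields, a hire date in 1998, and a computed region code, so no conversion step is needed after the import.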
Import Wizard is comprised of three parts: a standalone executable and
add-ins for Access and Excel. A free 30-day trial version is available.
platform: MS-Windows
license: Shareware, Free Trial
MONARCH:
"A powerful Windows-based report mining tool that lets you easily access,
extract, and manipulate live data from virtually any existing
computer-generated report. Monarch lets non-technical professionals access and
manipulate data from virtually any existing computer-generated report. Users
can also search the report files for information, print selected pages, copy
and paste information to other applications or create a new database. The
extracted data can be filtered, sorted and rearranged. Monarch's new portable
reports feature allows quick distribution of electronic reports via the
Internet to virtually anyone in your organization. With Monarch, there's no
rekeying data from hardcopy reports, then proofing to correct mistakes; no
required outside custom programming; and no database security issues, because
there's no need to access the central database."
platform: MS-Windows
license: Commercial
EUDI
converts data from any application report or screen into Excel.
Just print the report or screen and EUDI will extract your data into Excel.
EUDI was designed for Excel users and is very easy to operate. If you know how
to define a table in Excel, you can get the data from your reports and screens
into neatly formatted Excel sheets.
platform: MS-Windows
license: Commercial
Cambio and DJXL
"Cambio answers the classic text mining challenge of creating structured
data tables from raw, irregular text files. Cambio enables users to process
live text feeds, parse XML documents, glean data from Internet sites and text
reports, and mine data from numerous other irregular sources. Cambio employs a
visual interface to automate the creation of extraction rules and conditions -
data fields and recognition tags are visually branded within a display of the
source data file. Cambio scripts can be used by Data Junction and DJEngine to
deliver a seamless transformation of irregular text files into hundreds of
different database formats.
Developers can infinitely customize Cambio scripts beyond the capabilities
of Cambio's visual pattern recognition technology using DJXL. DJXL is a line
oriented programming language developed by Data Junction Corporation and
utilized by Cambio. It is most useful in the creation of complex scripts
necessary for files whose patterns and rules are too complex to be expressed by
Cambio. DJXL scripts require Data Junction or DJEngine for execution.
DJEngine
is a programmable, embeddable engine for executing projects and
transformations designed with Data Junction and Cambio and for automatically
creating, mapping and running default transformations. The product is a pure
execution engine, without any user interface components, making it ideal for
environments where data transformations need to be executed on demand or
scheduled on a regular basis.
DJEngine includes an API for integration with other programs as well as a
command-line interface that drives the conversion process using options entered
with the command line. The API enables tight control of the transformation
process, including error handling, record filtering, transaction coordination,
user interaction and on-the-fly parameter changes. DJEngine's scalability and
easy integration make it perfect for embedding in applications as well as for
implementing data warehouses, data marts, and diverse data migration and
replication strategies throughout the enterprise. Its small footprint allows it
to run on Windows laptops, but it is robust enough to execute on large UNIX
servers.
The Streaming Data SDK can be used to configure DJEngine to read or write
streams of data in the form of messages or real time feeds."
The slice program reads an input file and divides its prepared ASCII contents
into possibly overlapping slices. These slices are determined by enclosing
blocks, which are defined by begin and end delimiters that must already be
present in the file. The final output is calculated from a slice term
consisting of slice names, set-theory operators and optional round brackets.
For more information, please visit WML.
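
The idea of overlapping named slices combined by set operations can be sketched as follows. This is a rough illustration, not the slice program itself: the `[NAME:` and `:NAME]` marker syntax is assumed here, and Python's set operators stand in for the operators of a real slice term.

```python
import re

# Assumed delimiter syntax for this sketch: "[NAME:" opens a slice, ":NAME]" closes it.
MARKER = re.compile(r"\[([A-Z]+):|:([A-Z]+)\]")


def read_slices(source):
    """Return (plain_text, slices), where slices maps each slice name
    to the set of character positions in plain_text that it covers."""
    plain = []        # characters of the text with all markers removed
    slices = {}       # slice name -> set of positions in plain
    open_names = []   # slices that are currently open
    pos = 0
    for match in MARKER.finditer(source):
        chunk = source[pos:match.start()]
        start = len(plain)
        # Text between two markers belongs to every slice open at that point.
        for name in open_names:
            slices[name].update(range(start, start + len(chunk)))
        plain.extend(chunk)
        begin, end = match.groups()
        if begin:
            slices.setdefault(begin, set())
            open_names.append(begin)
        else:
            open_names.remove(end)  # raises ValueError on an unmatched end marker
        pos = match.end()
    plain.extend(source[pos:])
    return "".join(plain), slices


def render(plain, positions):
    """Build the output text from a set of selected character positions."""
    return "".join(plain[i] for i in sorted(positions))
```

With the input `[A:one [B:two:A] three:B]`, slices A and B overlap on the word "two". Evaluating `render(plain, slices["A"] & slices["B"])` selects only the overlap, while `slices["A"] | slices["B"]` selects the whole text.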
pg2xbase
is a set of utilities for converting PostgreSQL database tables to
and from DBF databases.
platform: Linux
license: GPL
This program takes an xBase file and sends queries to a
PostgreSQL server to insert it into a table.
platform: Linux
package: Debian
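
The general approach of turning decoded xBase records into INSERT queries can be sketched like this. It is not the code of the program listed above: the helper names are hypothetical, the records are assumed to be already decoded into dictionaries, and an in-memory SQLite database stands in for the PostgreSQL server so the sketch runs without one. A PostgreSQL driver following the Python DB-API would typically use the `%s` placeholder instead of `?`.

```python
import sqlite3


def quote_ident(name):
    """Quote a table or column name, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_insert(table, columns, placeholder="%s"):
    """Build a parameterized INSERT statement for the given columns."""
    cols = ", ".join(quote_ident(c) for c in columns)
    marks = ", ".join(placeholder for _ in columns)
    return f"INSERT INTO {quote_ident(table)} ({cols}) VALUES ({marks})"


def load_records(conn, table, records, placeholder="%s"):
    """Insert a list of dict records into a table; returns the number of rows sent."""
    if not records:
        return 0
    columns = list(records[0])
    sql = build_insert(table, columns, placeholder)
    cur = conn.cursor()
    # Values are passed separately from the SQL text, so no manual escaping is needed.
    cur.executemany(sql, [tuple(rec[c] for c in columns) for rec in records])
    conn.commit()
    return len(records)


# Stand-in demonstration with SQLite; table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "customers" ("id" INTEGER, "name" TEXT)')
rows = [{"id": 1, "name": "O'Brien"}, {"id": 2, "name": "Meyer"}]
load_records(conn, "customers", rows, placeholder="?")
```

Passing values as parameters rather than pasting them into the SQL string keeps names such as "O'Brien" from breaking the query.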
Gupta SqlBase to Oracle:
A Perl script by Joerg Pissarek that loads a Gupta SqlBase UNLOAD file into an
Oracle database. It may also serve as an example of how to convert unload files
from other SQL database systems.
platform: Perl
Dbf2pos
converts DBF database
files into SQL code. It also lets you extract only the fields you want and set
ranges on the extracted fields.
CsvBridge
is a Java utility that allows you to import and export
comma-delimited data (csv files) to and from a PostgreSQL database.
license: GPL
Shape Tools
consists of some very simple programs that convert
ARC/INFO EXPORT (e00) format files to ESRI/Shapefiles (shp).
license: GNU General Public License (GPL)