Greg Wilson: Data Crunching, 2005

"Every day, all around the world, programmers have to recycle legacy data, translate from one vendor's proprietary format into another's, check that configuration files are internally consistent, and search through web logs to see how many people have downloaded the latest release of their product. This kind of 'data crunching' may not be glamorous, but knowing how to do it efficiently is essential to being a good programmer. This book describes the most useful data crunching techniques, explains when you should use them, and shows how they will make your life easier. Along the way, it will introduce you to some handy, but under-used, features of Java, Python, and other languages. It will also show you how to test data crunching programs, and how data crunching fits into the larger software development picture."

David Cross: Data Munging with Perl, 2001

Techniques for data recognition, parsing, transformation and filtering

This book shows you how to process data productively with Perl. It discusses general munging techniques and how to think about data munging problems. You will learn how to decouple the various stages of munging programs, how to design data structures, how to emulate the Unix filter model, and more. If you need to work with complex data formats, it will teach you how to do that and how to build your own tools to process these formats. The book includes detailed techniques for processing HTML and XML. It also shows you how to build your own parsers to process data of arbitrary complexity.

DataWatch: Monarch

"A powerful Windows-based report mining tool that lets you easily access, extract, and manipulate live data from virtually any existing computer-generated report. Monarch lets non-technical professionals access and manipulate data from virtually any existing computer-generated report. Users can also search the report files for information, print selected pages, copy and paste information to other applications or create a new database. The extracted data can be filtered, sorted and rearranged. Monarch's new portable reports feature allows quick distribution of electronic reports via the Internet to virtually anyone in your organization. With Monarch, there's no rekeying data from hardcopy reports, then proofing to correct mistakes; no required outside custom programming; and no database security issues, because there's no need to access the central database."

