The Data Warehouse Toolkit - Reviews: looking before leaping on the 'E' bandwagon - Review
Jennifer C. Patterson
The Data Warehouse Toolkit, Ralph Kimball (John Wiley & Sons, Inc., 1996)
You've decided that your university would benefit from building a data warehouse. Bearing in mind the advice of Georgetown University's Ronald Allen to "inform yourself" (see "Designing for History"), you set out to do some reading. Where do you begin?
One possible starting point would be the books by Ralph Kimball. Kimball, a speaker, teacher, and consultant on data warehousing, has written a trilogy of books designed to help the project manager, data-warehouse developer, or vice president of information systems to design, build, and maintain a data warehouse, drawing information from transactional systems and Web activity.
The seminal book in the series is The Data Warehouse Toolkit.
Designed as a primer for the technically minded, it covers all of the basic knowledge and competencies needed to build and maintain a data warehouse. This includes assessing the needs of the university, designing the warehouse, adding snapshots and summaries of data, and bringing the data to the end user's desktop in an easily usable form. Numerous diagrams and highlighted hints will guide the reader through the process with a minimum of confusion. Additionally, Kimball has included a chapter with lists of questions to ask your end users in order to design the most useful data warehouse for their needs.
The next entry in the set is The Data Warehouse Lifecycle Toolkit, on which Kimball is joined by co-authors Laura Reeves, Margy Ross, and Warren Thornthwaite. Targeting the IT manager or other professional charged with the day-to-day responsibilities of managing and expanding an existing data warehouse, this book expands upon the topics addressed in its predecessor. And, like its predecessor, this book benefits from superb organization and a wealth of charts, tips, diagrams, and lists. Both books, although now a few years out of date, still contain much valuable information that will stay current as long as relational databases and the star-schema approach to organizing a data warehouse remain in use.
The most recent entry in the series is The Data Webhouse Toolkit, written with Richard Merz. This volume demonstrates just how quickly technology and expectations have changed in the four years since The Data Warehouse Toolkit made its debut. Imagine the power of gathering admissions data not just from the admissions office transactional system but from every prospective student who visits your Web site, searches for information, or requests information. No longer are data warehouses limited to gathering data from only the in-house transactional systems; now, the Web is a source of data as well.
Aimed at the IT manager who works with an existing data warehouse and who may have been charged with some Webmaster responsibilities, this book fulfills the expectations created by the previous two, delivering well-organized content with numerous helpful hints for the reader.
Intercepting and capturing the stream of data from a Web site requires awareness of a new set of concerns. How should you design your Web site for the best--and most data-generating--experience? How do you ensure your visitors' trust and privacy while still gathering data? And where will we go next?
Kimball has a prediction to answer that last question. He believes "the next decade will be the coming of age for data mining," which requires tools to make sense of behavior from the data gathered. With the flood of data now available for analysis, interpretation of patterns and behavior will become increasingly important.
Kimball's trilogy of data-warehousing books belongs not just on the shelves but in the hands of any individual or team charged with designing, building, and maintaining a data warehouse. The books succeed in doing much of the "heavy lifting" required by such a project by pointing out design tips, providing insightful diagrams, and warning the reader away from common mistakes.
The books do require a certain familiarity with technology and the principles of database and Web design, which may make them unsuitable for a casual reader. However, they are unbeatable insurance for a data-warehousing team that wants to educate itself before calling consultants, vendors, and training institutes for a new data-warehousing project.
Jennifer C. Patterson is an independent writer based in Centerville, Ohio, who specializes in higher education and technology issues. She holds a master's degree in college student personnel services from Miami University and has worked as a registrar, academic adviser, admission counselor, financial aid counselor, and instructor. She can be reached at jcpatterson@prodigy.net.
COPYRIGHT 2000 Professional Media Group LLC
COPYRIGHT 2001 Gale Group