
Article information

  • Title: The Data Warehouse Toolkit - Review
  • Author: Robert Craig
  • Journal: ENT
  • Print ISSN: 1085-2395
  • Electronic ISSN: 1085-2395
  • Year of publication: 1997
  • Volume: Dec 17, 1997
  • Publisher: 101Communications Llc

The Data Warehouse Toolkit - Review

Robert Craig

A regular habitué of the comp.databases.olap newsgroup frequently encounters questions from people looking for good data warehousing books. I've had the opportunity over the past year to read several such books and thought the end of the year might be a good time to give you my impressions of them.

The best one I've read this year has to be The Data Warehouse Toolkit by Ralph Kimball (John Wiley & Sons, Inc.). The subtitle is Practical Techniques for Building Dimensional Data Warehouses, and it certainly lives up to its billing. Kimball, who founded Red Brick Systems, describes how to design and implement a star schema-based dimensional data warehouse. His writing style is clear, his arguments are cogent, and the book is well-organized.

The book uses a variety of settings -- grocery store, insurance company, and so on -- to illustrate design principles for dimensional data warehouses in different industries. These chapters contain many practical tips, along with a description of how to determine the warehouse size. The second half of the book details the process of defining and building a warehouse. An excellent book. Highly recommended.

Unfortunately, the same cannot be said about OLAP Solutions: Building Multidimensional Information Systems by Erik Thomsen (John Wiley & Sons, Inc.). This massive, 565-page tome tries to do too much. Thomsen has produced a comprehensive treatise on OLAP architectures and tools. However, in his zeal to be comprehensive he gets too deeply involved with the details of design and implementation. And when Thomsen gets to example scenarios in Chapter 12, the book turns into a tutorial on TM/1, which is used to illustrate the examples. Thomsen's point quickly gets lost in the many pages devoted to screen shots and detailed instructions: "Drag the invstrat dimension into the left-most column ...," for example. Some sections of the book are so replete with graphs, charts and diagrams that the reader is forced to flip several pages ahead of the text reference in order to see the graphics. This, coupled with Thomsen's verbose writing style, makes for tough sledding.

This is too bad, because there is a lot of good information in this book. The sections on dimensional modeling and analysis, sparsity, formulas, links and storage architectures are all excellent and very worthwhile. If you have the patience to slog through (or the impatience to skip over) the sections that don't interest you, this book can be an excellent reference. A worthy attempt, which would have benefited from better editing. Recommended with reservations.

Understanding and Implementing Successful Data Marts by Douglas Hackney (Addison-Wesley Developers Press) is another book I can recommend only with reservations. Hackney has a personable style that probably works in the many seminars he conducts around the country, but that comes across as a little too breezy in a book. He makes a good case for developing data marts, rather than relying on an enterprise-wide data warehouse. There are a number of good chapters, such as the one discussing metadata, but these are offset by numerous chapters with extensive checklists describing the software development process. Readers who are actively considering building a data mart probably don't need all the "Software Development 101" tips and pointers that take up way too much space in this book. Recommended for newcomers to data warehousing.

Finally, I turn my attention to Data Mining Techniques for Marketing, Sales, and Customer Support by Michael Berry and Gordon Linoff (John Wiley & Sons, Inc.). This is an excellent, thorough and valuable introduction to data mining. The authors, who are data mining consultants, describe their data mining methodology and follow this with a series of chapters illustrating the various categories of data mining algorithms, such as cluster detection, market basket analysis, and neural nets, in detail. They describe the strengths and weaknesses of each category and conclude each chapter with a useful short summary. They point out where it can be helpful to combine two or more data mining techniques to produce the desired result. Chapters on data warehouses and OLAP place data mining in a usable context. Clear and lucid writing, coupled with useful, real-world examples, makes this book a pleasure to read. Highly recommended.

Hope this was helpful. See you all next year!

-- Robert Craig is director, Data Warehousing and Business Intelligence Division, at Hurwitz Group Inc. (Framingham, Mass.). Contact him at rcraig@hurwitz.com or via the Web at www.hurwitz.com.

COPYRIGHT 1997 101 Communications, Inc.
COPYRIGHT 2004 Gale Group
