Turning Spreadsheets into Corporate Data

By: Bill Inmon

Overview of this book

Spreadsheets are a popular way to store and communicate business data, but, although they are easy to create and update, they are not reliable enough to be used for making important corporate decisions. With this book, you can gain insight into how to maintain spreadsheets, how to format them, and then convert them into a database of reliable and useful information. Turning Spreadsheets into Corporate Data starts with a quick history of spreadsheet usage. You’ll learn the basics of formatting spreadsheets, including how to handle special characters and column headings, and how to convert the spreadsheet first into an intermediate database and then into corporate data. You will also learn how to utilize the mnemonic dictionary that is created along with the intermediate database. The later chapters discuss the immutability of data and the importance of organizational and political considerations during the data transformation. By the end of this book, you’ll have the skills and knowledge needed to convert your spreadsheets into reliable corporate data.
Table of Contents (16 chapters)
1. Introduction
14. 13: Case Study
15. Glossary
16. Index

PDF and OCR

One alternative is to subject the .pdf image to OCR (optical character recognition) technology. This way, you can read the contents of the spreadsheet on a line-by-line basis.

The best time to turn the OCR option on is during the initial capture of the spreadsheet. However, even if the OCR option was not on when the spreadsheet was initially captured, it is always possible to go back and turn the OCR option on after the fact.

Again, OCR reads and records the spreadsheet, allowing you to identify the row identifiers. However, you have still lost the column names. Without the xlstab characters, it is very difficult to determine what the column names are.
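
To make the limitation concrete, here is a minimal Python sketch of line-by-line processing of OCR output. It assumes the OCR step has already produced plain text, that cells on a line are separated by whitespace, and that the first token on each line is the row identifier. The function name parse_ocr_lines and the sample data are hypothetical and used only for illustration.

```python
def parse_ocr_lines(ocr_text):
    """Split OCR text into (row_identifier, values) pairs, one per non-empty line.

    Assumption: the first whitespace-separated token on a line is the row
    identifier; every remaining token is a cell value with no column name.
    """
    rows = []
    for line in ocr_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines in the OCR output
        row_id, values = tokens[0], tokens[1:]
        rows.append((row_id, values))
    return rows


# Hypothetical OCR output for a small spreadsheet.
sample = """Rent 1200 1200 1250
Utilities 310 295 330
"""

for row_id, values in parse_ocr_lines(sample):
    # The values are known only by position; their column names are lost.
    print(row_id, values)
```

The row identifiers come through, but each value is known only by its position on the line, which is the gap the missing column names leave.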