Managing Multimedia and Unstructured Data in the Oracle Database

By: MARCEL KRATOCHVIL

Overview of this book

Multimedia is the new digital frontier. Managers, software architects, administrators, and developers need to fully comprehend this exciting new technology, as its widespread use and acceptance can no longer be ignored.

"Managing Multimedia and Unstructured Data in the Oracle Database" will give you a complete understanding of how to manage all data, especially multimedia. You will learn all the latest terminology, how to set up a database, load digital objects, search on them, and even how to sell them. Whether you are a manager or database administrator, this book will give you the knowledge you need to take control of this rapidly growing and industry-changing technology, which is transforming our lives.

Starting with the basic principles of unstructured data and detailing the concepts behind multimedia warehouses and digital asset management systems, this book will describe how to load this data, search against it, display it intelligently, and deliver it to customers and users. Learn how all these concepts work within the Oracle 11g R2 database environment and how to tune the database effectively to manage it.

Begin to learn about this new and exciting field and use it to give your business a competitive edge or give yourself the ability to take a leadership role in this exciting new computing genre.
Table of Contents (22 chapters)
Managing Multimedia and Unstructured Data in the Oracle Database
Credits
About the Author
Acknowledgement
About the Reviewers
www.PacktPub.com
Preface
Index

Chapter 1. What is Unstructured Data?

There has been a noticeably slow uptake in the use of databases to manage unstructured data, in particular multimedia data. The technology for managing multimedia, at both the hardware and software levels, is mature and stable. What prevents sites from moving to storing multimedia in the database is a lack of expertise and understanding, together with a conservative view fostered by a number of factors, including historical issues with performance and integration software.

First, it is important to define what multimedia is in relation to structured and unstructured data. Unstructured data is any data that is not stored in a structured format. Structured data is anything that has an enforced composition to the atomic data types (1).

A relational database stores data in a structured format. Some non-relational databases also store their data in a structured format, so relational data can be considered a subset of structured data. Data stored inside object-oriented databases is also considered structured, as is XML, although because the structure of XML is fluid, one can more precisely consider XML as semi-structured.
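The distinction can be made concrete with a small sketch. It is illustrative only: Python's built-in sqlite3 module stands in for a relational database (it is not Oracle), and the photo table, its columns, and all sample values are hypothetical names chosen for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Structured: the table definition enforces the composition of every row.
# (sqlite3 is only a stand-in here; table and column names are hypothetical.)
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE photo (
        id       INTEGER PRIMARY KEY,
        title    TEXT    NOT NULL,
        width_px INTEGER NOT NULL CHECK (typeof(width_px) = 'integer')
    )
    """
)
conn.execute(
    "INSERT INTO photo (title, width_px) VALUES (?, ?)", ("Harbour", 640)
)

# A row that breaks the declared composition is refused by the database.
try:
    conn.execute(
        "INSERT INTO photo (title, width_px) VALUES (?, ?)", ("Bridge", "wide")
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM photo").fetchone()[0]

# Semi-structured: both records are well-formed XML, yet each carries
# a different set of fields, and nothing here forces them to agree.
doc = ET.fromstring(
    "<photos>"
    "<photo><title>Harbour</title><width>640</width></photo>"
    "<photo><title>Bridge</title><caption>At dusk</caption></photo>"
    "</photos>"
)
fields_per_record = [sorted(child.tag for child in photo) for photo in doc]

# Unstructured: an opaque run of bytes. Without extra knowledge of the
# format, the only thing that can be stated about it is its length.
blob = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + b"\x00" * 12
blob_length = len(blob)

print(rejected, row_count, fields_per_record, blob_length)
```

In this sketch the table refuses the malformed row, the XML parser accepts two records with different fields, and the byte string carries no composition that the program can check at all.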

There is a large amount of unstructured data in the real world that needs managing. In the last ten years most organizations have begun to recognize that there is a great need to manage it and to understand it. Because unstructured data refers to anything that is not structured, it can be very difficult to understand what is out there and how to deal with it. The traditional thinking has been to just treat it as a blob (binary large object), but with a greater understanding of the variety of unstructured data types that exist, the need to manage them has grown.

To help understand this point, think of geometry and the rules (mathematics) associated with it. When mathematicians first tried to come to grips with circles, triangles, and other shapes, the subject proved so complex that they started with the basic concepts. This meant dealing with geometry in a two-dimensional world. In this world view, triangles had three sides and three angles that always added up to 180 degrees. Parallel lines never met. By focusing on this world view alone, a greater understanding of geometry was formed. Core principles were worked out, along with a lot of formulas and mathematics. In this analogy, the two-dimensional world is equivalent to structured data.

Once this two-dimensional world had become well studied and understood, the focus moved to the real three-dimensional world to see how it would behave. The three-dimensional world proved to be very complex, and so attention was focused on key areas that could be understood. These included the study of knots, symmetry, surfaces with holes, and curves. Some of the two-dimensional rules flowed through to the three-dimensional world, but others did not. On a curved surface, lines that start out parallel can meet, and the angles of a triangle can add up to more or less than 180 degrees.

In this analogy, unstructured data is the three-dimensional world, and there is a need to understand what is in it. Just as there is no complete understanding of three-dimensional geometry, so there is no full understanding of unstructured data. It is an evolving and growing discipline as more information and experience is gathered, tested, and learnt. So, as with the study of knots, holes, and curves, one can focus on key areas of unstructured data and learn from them. One key component is multimedia, which contains video, audio, photographs, and documents.

Multimedia is also referred to as rich media. It is not limited to the four types identified, and some might even debate whether documents are a component of multimedia. As will be shown, by breaking multimedia down into its fundamental components, one can classify these multimedia types and then develop new types from them. These include three-dimensional objects, simulation data, and neural network data.

The analogy of comparing three-dimensional geometry to unstructured data works well, and one also has to consider that mathematicians have gone beyond three-dimensional geometry into multi-dimensional geometry in an effort to help explain some key components of string theory, quantum theory, and astronomy. There are still a lot of unknowns with unstructured data. The recent arrival of quantum computing, which uses qubits to store information, will undoubtedly push the field of unstructured data management into completely new areas (2).

Just as there is overlap between the two-dimensional world and the three-dimensional world, so there is overlap between multimedia and structured data. The two are dependent on each other at the moment, but with improvements in technology this might eventually change. It is important to realize that as technology changes, the rules change. Working in multimedia is like trying to hit a moving target: what is right today might be invalidated tomorrow.