Book Image

Creating your MySQL Database: Practical Design Tips and Techniques

By : Marc Delisle
Book Image

Creating your MySQL Database: Practical Design Tips and Techniques

By: Marc Delisle

Overview of this book

For most of us, setting up the database for an application is often an afterthought. While you don't need to be a professional database designer to create a working application, knowing a few insider tips and techniques can make both the process easier and the end result much more effective. This book doesn't set out to make you an expert in data analysis, but it does provide a quick and easy way to raise your game in this essential part of getting your application right.
Table of Contents (12 chapters)

The Tale of the Too Wide Table


This book focuses on representing data in MySQL. The containers of tables in MySQL, and other products are the databases. It is quite possible to have just one table in a database and thus avoid fully applying the relational model concept in which tables are related to each other through common values; however we will use the model in its normal way: having many tables and creating relations between them.

Note

This section describes an example of data crammed into one huge table, also called a too wide table because it is formed with too many columns. This too wide table is fundamentally non-relational.

Sometimes the data structure needs to be reviewed or evaluated, as it might be based on poor decisions in terms of data naming conventions, key choosing, and the number of tables. Probably the most common problem is that the whole data is put into one big, wide table.

The reason for this common structure (or lack of structure) is that many developers think in terms of the results or even of the printed results. Maybe they know how to build a spreadsheet and try to apply spreadsheet principles to databases. Let's assume that the main goal of building a database is to produce this sales report, which shows how many cars were sold in each month, by each salesperson, describing the brand name, the car model number, and the name.

Salesperson

Period

Brand Name

Car model number

Car model name and year

Quantity sold

Murray, Dan

2006-01

Fontax

1A8

Mitsou 2007

3

Murray, Dan

2006-01

Fontax

2X12

Wanderer 2006

7

Murray, Dan

2006-02

Fontax

1A8

Mitsou 2007

4

Smith, Peter

2006-01

Fontax

1A8

Mitsou 2007

1

Smith, Peter

2006-01

Licorne

LKC

Gazelle 2007

1

Smith, Peter

2006-02

Licorne

LKC

Gazelle 2007

6

Without thinking much about the implications of this structure, we could build just one table, sales:

salesperson

brand

model_number

model_name_year

qty_2006_01

qty_2006_02

Murray, Dan

Fontax

1A8

Mitsou 2007

3

4

Murray, Dan

Fontax

2X12

Wanderer 2006

7

 

Smith, Peter

Fontax

1A8

Mitsou 2007

1

 

Smith, Peter

Licorne

LKC

Gazelle 2007

1

6

At first sight, we have tabularized all the information that is needed for the report.

Note

The book's examples can be reproduced using the mysql command-line utility, or phpMyAdmin, a more intuitive web interface. You can refer to Mastering phpMyAdmin 2.8 for Effective MySQL Management book from Packt Publishing (ISBN 1-904811-60-6). In phpMyAdmin, the exact commands may be typed in using the SQL Query Window, or we can benefit from the menus and graphical dialogs. Both ways will be shown throughout the book.

Here is the statement we would use to create the sales table with the mysql command-line utility:

CREATE TABLE sales (
salesperson char(40) NOT NULL,
brand char(40) NOT NULL,
model_number char(40) NOT NULL,
model_name_year char(40) NOT NULL,
qty_2006_01 int(11) NOT NULL,
qty_2006_02 int(11) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

In the previous statement, while char(40) means a column with 40 characters, int(11) means an integer with a display width of 11 in MySQL.

Using the phpMyAdmin web interface instead, we would obtain:

Here we have entered sample data into our sales table:

INSERT INTO sales VALUES ('Murray, Dan', 'Fontax', '1A8', 'Mitsou 2007', 3, 4);
INSERT INTO sales VALUES ('Murray, Dan', 'Fontax', '2X12', 'Wanderer 2006', 7, 0);
INSERT INTO sales VALUES ('Smith, Peter', 'Licorne', 'LKC', 'Gazelle 2007', 1, 6);
INSERT INTO sales VALUES ('Smith, Peter', 'Fontax', '1A8', 'Mitsou 2007', 1, 0);

However this structure has many maintenance problems. For instance, where do we store the figures for March 2006? To discover some of the other problems, let's examine sample SQL statements we could use on this table to query about specific questions, followed by the results of those statements:

/* displays the maximum number of cars of a single model sold by each vendor in January 2006 */
SELECT salesperson, max(qty_2006_01)
FROM sales
GROUP BY salesperson
/* finds the average number of cars sold by our sales force taken as a whole, in February 2006 */
SELECT avg(qty_2006_02)
FROM sales
WHERE qty_2006_02 > 0
/* finds for which model more than three cars were sold in January */
SELECT model_name_year, SUM(qty_2006_01)
FROM sales
GROUP BY model_name_year
HAVING SUM(qty_2006_01) > 3

We notice that, although we got the answers we were looking for, with the above SQL queries, we would have to modify column names in the queries to obtain results for other months. Also, it becomes tricky if we want to know the month for which the sales have surpassed the yearly average, because we have to potentially deal with twelve column names. Another problem would arise when attempting to report for different years, or to compare a year with another one.

Moreover, a situation that could demonstrate the poor state of this structure is the need for a new report. A structure that is based too closely on a single report instead of being based on the intrinsic relations between data elements does not scale well and fails to accommodate future needs.

Chapter 4 will unfold those problems.