Ok, here's a question for all you database boffins. At work, we have a *very* large database, and a rewrite of our system is pending. As part of this rewrite, we intend to optimize the database and shuffle data around. Here's the current scenario.
We have 53 tables in all, most of which don't matter for this example; only 2 do. There's a "logging" table, which has around 22 million rows of vehicle tracking data. Each row has an ID reference to a vehicle table. Then there's a second table, "addresses", which stores its own unique key, the actual address name for the latitude/longitude, and a code reference to the logging table (one for each row). When a query is done, it has to join these 2 tables together to get the address from the addresses table and the rest of the details from the logging table.
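To make the current layout concrete, here's a rough sketch using Python's built-in sqlite3 module. Only the table names come from the description above; the column names and sample rows are made up for illustration and are not the real schema.

```python
import sqlite3

# Hypothetical columns; only the table names "logging" and "addresses"
# come from the description above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logging (
        id         INTEGER PRIMARY KEY,
        vehicle_id INTEGER NOT NULL,   -- ID reference to the vehicle table
        latitude   REAL,
        longitude  REAL
    );
    CREATE TABLE addresses (
        id         INTEGER PRIMARY KEY,             -- its own unique key
        name       TEXT,                            -- address name for the position
        logging_id INTEGER REFERENCES logging(id)   -- one per logging row
    );
""")
conn.execute("INSERT INTO logging VALUES (1, 7, 53.35, -6.26)")
conn.execute("INSERT INTO logging VALUES (2, 7, 53.36, -6.27)")
conn.execute("INSERT INTO addresses VALUES (1, 'Main Street', 1)")

# LEFT JOIN so logging rows without an address still come back (name is None).
rows = conn.execute("""
    SELECT l.id, l.vehicle_id, a.name
    FROM logging AS l
    LEFT JOIN addresses AS a ON a.logging_id = l.id
    ORDER BY l.id
""").fetchall()
print(rows)
```

With one addresses row per logging row, this join touches both big tables on every lookup, which is the cost both proposals below are trying to deal with.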
My colleague and I have different views on how this should be properly designed. Here's my view:
Put the address "name" in a column at the end of the logging table, thus removing the addresses table completely. This will eliminate the need to join these tables at all. We end up having complex routines in place to handle situations where logging entries do not have addresses for any reason.
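A minimal sketch of that idea, again with Python's sqlite3 and made-up column names (`address_name` in particular is just a placeholder). A missing address is assumed to be stored as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE logging (
        id           INTEGER PRIMARY KEY,
        vehicle_id   INTEGER NOT NULL,
        latitude     REAL,
        longitude    REAL,
        address_name TEXT      -- hypothetical column; NULL when no address is known
    )
""")
conn.executemany(
    "INSERT INTO logging VALUES (?, ?, ?, ?, ?)",
    [
        (1, 7, 53.35, -6.26, "Main Street"),
        (2, 7, 53.36, -6.27, None),  # entry without an address
    ],
)

# No join needed; COALESCE supplies a placeholder for missing addresses.
rows = conn.execute("""
    SELECT id, vehicle_id, COALESCE(address_name, 'unknown')
    FROM logging
    ORDER BY id
""").fetchall()
print(rows)
```

The trade-off in this sketch is that the same address text gets repeated on every row logged at that position.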
His view:
Split the logging table up and make a new table called "locations" (which would effectively replace the addresses table). In there, we would store the latitude & longitude and the address name in question. The logging table then has an entry that references a unique code number in the locations table, so we would still have to join them.
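Here's how I understand that design, sketched with Python's sqlite3. The column names, the `location_id` helper, and the UNIQUE constraint on latitude/longitude are my assumptions, not something we've agreed on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (
        id        INTEGER PRIMARY KEY,   -- the unique code number
        latitude  REAL NOT NULL,
        longitude REAL NOT NULL,
        name      TEXT,                  -- address name, NULL if unknown
        UNIQUE (latitude, longitude)     -- assumption: one row per distinct position
    );
    CREATE TABLE logging (
        id          INTEGER PRIMARY KEY,
        vehicle_id  INTEGER NOT NULL,
        location_id INTEGER NOT NULL REFERENCES locations(id)
    );
""")

def location_id(conn, lat, lon, name=None):
    """Hypothetical helper: return the id for a position, creating the row if needed."""
    row = conn.execute(
        "SELECT id FROM locations WHERE latitude = ? AND longitude = ?",
        (lat, lon),
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO locations (latitude, longitude, name) VALUES (?, ?, ?)",
        (lat, lon, name),
    )
    return cur.lastrowid

# Two log entries at the same position share one locations row.
samples = [
    (7, 53.35, -6.26, "Main Street"),
    (7, 53.35, -6.26, "Main Street"),
    (8, 53.40, -6.30, None),
]
for vehicle, lat, lon, name in samples:
    conn.execute(
        "INSERT INTO logging (vehicle_id, location_id) VALUES (?, ?)",
        (vehicle, location_id(conn, lat, lon, name)),
    )

location_count = conn.execute("SELECT COUNT(*) FROM locations").fetchone()[0]
rows = conn.execute("""
    SELECT g.id, g.vehicle_id, loc.name
    FROM logging AS g
    JOIN locations AS loc ON loc.id = g.location_id
    ORDER BY g.id
""").fetchall()
print(location_count, rows)
```

The join is still there, but in this sketch the locations table only grows with the number of distinct positions rather than with the number of logging rows.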
I think both approaches are valid, but which one would you pick? Or would you do it a different way?
I'm looking forward to hearing your views on this.
Thanks 