Tycoon Talk
Become a Big fish!
The number 1 forum for online business!
Post topics, ask questions, share your knowledge.
Tycoon Talk is part of Freelancer.com - find skilled workers online at a fraction of the cost.

The Database Forum


You are currently viewing our The Database Forum as a guest. Please register to participate.
Login



Closed Thread
What do you want to know about databases?
Old 10-02-2008, 05:17 PM What do you want to know about databases?
Learning Newbie's Avatar
Defies a Status

Latest Blog Post:
Astounding Republican Paranoia
Posts: 5,662
Name: John Alexander
Trades: 0
This is the companion thread to the DB Resource Sticky. That thread is meant to be like a "help file," and only moderators post in it. The good part of that is, you can trust anything you read in that thread. The bad part is, you can't just add a question, and have it answered there. So, any requests can go here, and we'll figure out the best way to address them.

To get things started, if anybody would like to suggest broad areas (How do I tune indexes? Best way to export data? When to make backups?) they'd like to see covered, specific questions, or anything else, we'll make sure these are covered.
__________________

Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE

Last edited by Learning Newbie; 10-02-2008 at 08:51 PM..
Learning Newbie is offline
View Public Profile
 
 
Register now for full access!
Old 10-02-2008, 06:09 PM Re: What do you want to know about databases?
VirtuosiMedia's Avatar
Web Design Made Simple

Posts: 1,228
Trades: 0
This might not be the best question, but I'd be interested in learning more about storing data in a semantic way. Also, are EAV databases scalable and practical and how might one best use one?
__________________
Want new web resources every day? - Follow me on
Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE
|
Please login or register to view this content. Registration is FREE
VirtuosiMedia is offline
View Public Profile Visit VirtuosiMedia's homepage!
 
Old 10-02-2008, 07:58 PM Re: What do you want to know about databases?
jamestl2's Avatar
No scale-itch here...

Latest Blog Post:
Wordpress Relative URLs Plugin
Posts: 2,389
Name: <member type="brilliant" alt="foolish">James Lewitzke</member>
Location: / public_html / Universe / Virgo_Supercluster / Local_Group / Milky_Way / Orion_Arm / Solar_System / Earth / North_America / USA / Wisconsin
Trades: 0
There's a DB Resource Sticky?

Anyway, like I said on your blog John, I'm going to be starting to create my own network website of sorts, so I'll be building the MySQL DB from scratch.

In fact, I just started writing it today and I think I understand the basic concepts well (Tables, Fields, Records, etc.) But as I started to fill in the records, I came across some input columns I had no clue about (using phpmyadmin):
  • Collation - It looks like it's used for encoding, but I have no idea which one to pick, and why Latin-Swedish is the default choice
  • Null - What in the world is this for?
  • Attribute - Not sure about this column either
One more thing I'm going to be concerned with too is Database Optimization and the exact number of inserts I'm going to need, but that won't be for later.

Anyway, expect many more threads and questions from me in this forum pretty soon .
__________________
Engipress -
Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE
for Wordpress Projects
jamestl2 is offline
View Public Profile Visit jamestl2's homepage!
 
Old 10-02-2008, 08:01 PM Re: What do you want to know about databases?
Learning Newbie's Avatar
Defies a Status

Latest Blog Post:
Astounding Republican Paranoia
Posts: 5,662
Name: John Alexander
Trades: 0
An EAV system can be very hard to scale. It can be done, if the data is inherently set up to allow distribution and partitioning. For everyone else, an EAV (Entity, Attribute, Value) database is generally two tables

Create Table Object (
ObjectID int identity primary key,
ObjectName nvarchar(max)
)

Create Table Attribute (
ObjectID foreign key references Object,
AttributeName nvarchar(max),
AttributeValue nvarchar(max)
)

This allows any number of objects to be stored (persisted) with any number of attributes. Each object can have its own set of attributes. The number of attributes (key value pairs) can be different for every object, however makes sense. This allows for highly unstructured data storage, where each and every object can be stored with whatever properties make sense for it, rather than having to fit it into a given schema.

The downside, though, is it gives up most of the strengths of a relational database system. If you store 1,000 objects with an average of 10 properties each, that's 10,000 rows stored in your Attribute table. That's an easy task for a DB, but say you need to store 10,000 more of them, and the average number of properties goes up to 15. Pretty quickly, this is going to get very big. Having only two tables, it will eventually get to the point that lookups start taking sizeable fractions of a second.

This can be ameliorated by using a natural key. Ideally a 16 byte hash value of the object. You might store 100 objects per table, or any number that performs well. Given an object to get properties for, you would calculate the hash value, then go to the appropriate table or database to get the real data. If you had a million objects to keep track of, but broke them into lists of 100 each, using a method like this, the seek times can remain very low.

Another downside is that, from one perspective, this is as simple as can be. But from another, it's actually kind of opaque. If you were hired to manage a database set up in this way, or to build web pages around one, it might take some wrapping your head around until you get the hang of it? I don't personally think it's any more combersome to learn (probably less!) than any other schema, but a lot of people seem to object, saying it's too complicated to see a lot of property douples, with ID values next to them.

Now, I've been planning to add a section for schema objects and schema definition, with some of the more common ones described, like a star schema, third normalized form, and such. It seems like this would be a good one to include. Honestly, for small web sites, it's not going to be much of a burden with good indexing, and it's highly flexible.
__________________

Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE
Learning Newbie is offline
View Public Profile
 
Old 10-02-2008, 09:02 PM Re: What do you want to know about databases?
Learning Newbie's Avatar
Defies a Status

Latest Blog Post:
Astounding Republican Paranoia
Posts: 5,662
Name: John Alexander
Trades: 0
Quote:
Originally Posted by jamestl2 View Post
There's a DB Resource Sticky?
There is now! It's not a sticky, yet, tho. Still filling out the content - I don't want to get people's hopes up until the answers are there.

Quote:
Originally Posted by jamestl2 View Post
  • Collation - It looks like it's used for encoding, but I have no idea which one to pick, and why Latin-Swedish is the default choice
How strings are sorted and compared. Really it's just the character set. This is meaningless for English, because we have no "special" characters by definition. European languages have a few accents, but enough to fit inside a UTF 8. Only a-z, A-Z, 0-9, and ~-+ all fit in UTF-7, leaving you 128 slots left, and you can decide what they represent.

Quote:
Originally Posted by jamestl2 View Post
  • Null - What in the world is this for?
In most programming languages, a boolean is a two pronged variable, with true and false. In a database, unless you disallow it, null is the third prong. It's the absence of a value. Technically, it means "I don't know." It's treated specially, in that nothing equals null, and null plus (or minus or times, etc) anything is null. This makes sense, because 5 + "I don't know" is still unknown.
__________________

Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE
Learning Newbie is offline
View Public Profile
 
Old 10-02-2008, 10:41 PM Re: What do you want to know about databases?
jamestl2's Avatar
No scale-itch here...

Latest Blog Post:
Wordpress Relative URLs Plugin
Posts: 2,389
Name: <member type="brilliant" alt="foolish">James Lewitzke</member>
Location: / public_html / Universe / Virgo_Supercluster / Local_Group / Milky_Way / Orion_Arm / Solar_System / Earth / North_America / USA / Wisconsin
Trades: 0
Alright, thanks .

Just realized I'm going to have to create a LOT of records for everything that people will be using on the site. Something I'm wondering about here is whether or not I'm going to be creating *too many* records, as some may cross-over into other tables. I have a feeling I won't be needing a few of them.

The thing is I don't have absolutely everything planned out for the main site right away either. So would it be better you think to concentrate on creating the DB first? Or design and program the actual site first?

I know I'll probably be coming back to the DB anyway for whatever small changes that are necessary, but I don't want to get too ahead of myself.
__________________
Engipress -
Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE
for Wordpress Projects
jamestl2 is offline
View Public Profile Visit jamestl2's homepage!
 
Old 10-03-2008, 12:01 AM Re: What do you want to know about databases?
VirtuosiMedia's Avatar
Web Design Made Simple

Posts: 1,228
Trades: 0
I think you may want to take the time to plan ahead a little. Usually my databases dictate what I program to a great degree, so if you have that part straightened out, it could save you a lot of programming. One thing you might want to look into is normalization, which I'm sure John could explain and argue for a lot better than I could, even though I try to use it in all of my database design.
__________________
Want new web resources every day? - Follow me on
Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE
|
Please login or register to view this content. Registration is FREE
VirtuosiMedia is offline
View Public Profile Visit VirtuosiMedia's homepage!
 
Old 10-03-2008, 03:16 AM Re: What do you want to know about databases?
VirtuosiMedia's Avatar
Web Design Made Simple

Posts: 1,228
Trades: 0
Thanks, John. Once again, I'm learning a great deal about databases from you. And of course, as always, I have more questions.

Quote:
An EAV system can be very hard to scale. It can be done, if the data is inherently set up to allow distribution and partitioning.
So what would a scalable version look like? I'm having a little trouble wrapping my mind around how something like that could be effectively or logically partitioned.


Quote:
Create Table Object (
ObjectID int identity primary key,
ObjectName nvarchar(max)
)

Create Table Attribute (
ObjectID foreign key references Object,
AttributeName nvarchar(max),
AttributeValue nvarchar(max)
)

This allows any number of objects to be stored (persisted) with any number of attributes. Each object can have its own set of attributes. The number of attributes (key value pairs) can be different for every object, however makes sense. This allows for highly unstructured data storage, where each and every object can be stored with whatever properties make sense for it, rather than having to fit it into a given schema.
Would it perhaps be an improvement to add a third table that connects the objects with the attributes by foreign keys while removing the foreign key in the attributes? The example above seems like a many-to-many relationship, so to me, an associative table would make sense in this case.

Upon thinking it over a little more, how would you connect objects to each other? Or would you need to? I guess it depends on what the definition of an object is. So, for a use case example, suppose I have an object that represents a user id. I could attach first name, last name, email address, etc. to that object. But if that user makes a comment in a blog, is that comment an attribute or an object? If it's an attribute, how could I attach a timestamp, a vote, or a reply to that comment? If it's an object, how could attach it to the user object?

As an alternative, could you store all attributes and objects in the same table and use the second table to house two foreign keys (or hashes) that represent the same column in the first table (albeit with different key values)?

Quote:
The downside, though, is it gives up most of the strengths of a relational database system. If you store 1,000 objects with an average of 10 properties each, that's 10,000 rows stored in your Attribute table. That's an easy task for a DB, but say you need to store 10,000 more of them, and the average number of properties goes up to 15. Pretty quickly, this is going to get very big. Having only two tables, it will eventually get to the point that lookups start taking sizeable fractions of a second.
Any idea on what the upper limit might be, or at least the limit at which lookups start to bog down noticeably?

Quote:
This can be ameliorated by using a natural key. Ideally a 16 byte hash value of the object. You might store 100 objects per table, or any number that performs well. Given an object to get properties for, you would calculate the hash value, then go to the appropriate table or database to get the real data. If you had a million objects to keep track of, but broke them into lists of 100 each, using a method like this, the seek times can remain very low.
I'm a little unclear about your example here. Which table is the hash stored in and what exactly does it represent? What I'm kind of getting is that instead of having a primary key that auto-increments, you would programmatically assign a unique 16 byte hash as a unique identity for that object across all of the tables that objects are stored in. Would and could you also do the same for the attributes?

Quote:
Another downside is that, from one perspective, this is as simple as can be. But from another, it's actually kind of opaque. If you were hired to manage a database set up in this way, or to build web pages around one, it might take some wrapping your head around until you get the hang of it? I don't personally think it's any more combersome to learn (probably less!) than any other schema, but a lot of people seem to object, saying it's too complicated to see a lot of property douples, with ID values next to them.
I'm still trying to wrap my head around it too. I haven't researched it a whole lot yet (I've been buried in JavaScript lately), but it seems like it might be an interesting way to store data. I'm particularly interested in semantic data storage and retrieval and an EAV database appears like it might be fairly easy to search against, but I can definitely see some of the drawbacks as well. Have you personally used it and would you recommend using it, or which schema do you recommend?

Quote:
Now, I've been planning to add a section for schema objects and schema definition, with some of the more common ones described, like a star schema, third normalized form, and such. It seems like this would be a good one to include. Honestly, for small web sites, it's not going to be much of a burden with good indexing, and it's highly flexible.
Please do write more on the different schemas. I don't know why, but I find them fascinating for some reason.
__________________
Want new web resources every day? - Follow me on
Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE
|
Please login or register to view this content. Registration is FREE
VirtuosiMedia is offline
View Public Profile Visit VirtuosiMedia's homepage!
 
Old 10-03-2008, 12:43 PM Re: What do you want to know about databases?
jamestl2's Avatar
No scale-itch here...

Latest Blog Post:
Wordpress Relative URLs Plugin
Posts: 2,389
Name: <member type="brilliant" alt="foolish">James Lewitzke</member>
Location: / public_html / Universe / Virgo_Supercluster / Local_Group / Milky_Way / Orion_Arm / Solar_System / Earth / North_America / USA / Wisconsin
Trades: 0
Quote:
Originally Posted by VirtuosiMedia View Post
I think you may want to take the time to plan ahead a little. Usually my databases dictate what I program to a great degree, so if you have that part straightened out, it could save you a lot of programming.
So you think I should write the WHOLE DB first? (Well, I mean everything that I plan to put on the main site at this point in time). Before I even touch the design or programming aspect.

Quote:
Originally Posted by VirtuosiMedia View Post
One thing you might want to look into is normalization, which I'm sure John could explain and argue for a lot better than I could, even though I try to use it in all of my database design.
Alright, thanks. Yeah I have no idea what "normalization" is either.
__________________
Engipress -
Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE
for Wordpress Projects
jamestl2 is offline
View Public Profile Visit jamestl2's homepage!
 
Old 10-03-2008, 12:56 PM Re: What do you want to know about databases?
Learning Newbie's Avatar
Defies a Status

Latest Blog Post:
Astounding Republican Paranoia
Posts: 5,662
Name: John Alexander
Trades: 0
Quote:
Originally Posted by jamestl2 View Post
The thing is I don't have absolutely everything planned out for the main site right away either. So would it be better you think to concentrate on creating the DB first? Or design and program the actual site first?
The database has a very specific (even if it's ill defined right now) job. It's providing a service to your web application. If you don't know what's needed of the DB, you're better off waiting on that, and working on the web site. You'll get to a point where your application needs to remember X, and at that point, you have a fairly well defined need. Which makes it easier to learn how to address.
__________________

Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE
Learning Newbie is offline
View Public Profile
 
Old 10-03-2008, 03:38 PM Re: What do you want to know about databases?
Learning Newbie's Avatar
Defies a Status

Latest Blog Post:
Astounding Republican Paranoia
Posts: 5,662
Name: John Alexander
Trades: 0
Quote:
Originally Posted by VirtuosiMedia View Post
So what would a scalable version look like? I'm having a little trouble wrapping my mind around how something like that could be effectively or logically partitioned.
So this is what I was talking about in the previous post, with (for example) hash values. Let's take a more simplified version, and assume we're responsible for a database of people. We decide to use their email address as the table's primary key, because each email has only one owner, and so is unique.

For the sake of easy math, say we have 26,000 users, and even distribution. Everybody who's email starts with a goes to a table called Person_A, everybody who's email starts with b goes to Person_B, and so on down the line. Now we have 26 tables of 1,000 each. Their attributes can also be stored this way - as Properties_A, Properties_B, etc. To search all users with no prior knowledge will take a little bit longer this way, but, to look up a specific user will be faster. The trouble is, there won't be as many email addresses starting with X, which is why a hash value might be a better idea.

Quote:
Originally Posted by VirtuosiMedia View Post
Upon thinking it over a little more, how would you connect objects to each other? Or would you need to?
Generally, this should come in the form of requirements and use cases. There are endless numbers of implementations to choose from and features that can be supported - as database programmers, we need a "tie breaker" to choose. If objects need to be connected with one another, how? If not, it's usually best to avoid the complexity and overhead of representing this.

Quote:
Originally Posted by VirtuosiMedia View Post
As an alternative, could you store all attributes and objects in the same table and use the second table to house two foreign keys (or hashes) that represent the same column in the first table (albeit with different key values)?
I'm fond of the fish hook relationship, myself. It's called that because of the way it looks to draw a database diagram showing a single table related to itself.

Create Table Fishhook (
EntryID int identity primary key,
ParentID int null foreign key references Fishhook,
Name nvarchar(max),
Value nvarchar(max)
)

This models a tree. It allows virtually unlimited objects and attributes to be modeled, with a hierarchical relationship. When ParentID is null, the current object is a root node, otherwise, you can follow the chain. Recursively.

The downside is, SQL isn't especially good at this sort of recursive paradigm. Think of the query required to load an unknown number of items by following the chain of parents.

Quote:
Originally Posted by VirtuosiMedia View Post
Any idea on what the upper limit might be, or at least the limit at which lookups start to bog down noticeably?
That's going to depend on hardware, index structure, and the number of concurrent users firing queries. The faster the storage subsystem is, the less painful this will be, at any level. The more RAM you have, with most database servers, the more data will live in memory, and the less disc IO will be incurred, assuming you need the same rows often. Indexes that allow for seeks as opposed to scans can buy you orders of magnitude. Finally, the Enterprise Edition of SQL Server allows different transactions to share table scans. When two queries need the same data, the storage engine will return it to both, doubling performance. I'm pretty sure Oracle does the same. With all of this, you should be able to have hundreds of thousands of rows on a modest box.
__________________

Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE
Learning Newbie is offline
View Public Profile
 
Old 10-31-2008, 03:16 PM Re: What do you want to know about databases?
VirtuosiMedia's Avatar
Web Design Made Simple

Posts: 1,228
Trades: 0
Here is an interesting link from Wikipedia. I didn't realize that there were 6 forms of normalization. I only knew about the first 3.
__________________
Want new web resources every day? - Follow me on
Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE


Please login or register to view this content. Registration is FREE
|
Please login or register to view this content. Registration is FREE
VirtuosiMedia is offline
View Public Profile Visit VirtuosiMedia's homepage!
 
Old 11-08-2008, 11:18 AM Re: What do you want to know about databases?
Average Talker

Posts: 23
Trades: 0
I would like a list of the best visual database modeling applications, and a selection of book recommendation
Thanks.
calande is offline
View Public Profile
 
Closed Thread     « Reply to What do you want to know about databases?
 

Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are Off
Pingbacks are Off
Refbacks are Off





   
RSS Feed  Feeds: RSS   JS   XML
RSS Feed  Feeds for this forum: RSS   JS   XML



Page generated in 0.44654 seconds with 11 queries