 |
05-07-2007, 07:30 PM
|
Speed for Join statement
|
Posts: 522
Name: Gabe Solomon
Location: Romania
|
Here is one for you guys ... 
what is much faster running a single querry with a join statement or
making more then one, separate querys.
I read that in theory ( i bet i will get some arguements about this ) Join is the fastest ... but i found that in some practical experiments .... doing separate querry's is much faster.
So what do you think ?
|
|
|
|
05-08-2007, 03:27 AM
|
Re: Speed for Join statement
|
Posts: 984
Name: Jeremy Miller
Location: Reno, NV
|
IMO the question is oversimplified. There's more than one type of join and they each have performance benefits/costs depending on the query and indices. Generally speaking, JOINs answer the question you're asking on the database side (with temp tables) and only return the results. If you separate the queries so that a join is not needed and the query you execute first returns many results which must then be re-passed back to the SQL server, you're just making things worse. I can't think of any situations where using separate queries instead of a well-written join would actually be efficient when scaled up to 10's of 1000's of executions. Just thinking about it, what you'd be performing in the absence of a join would be what the join itself would be accomplishing with the added disadvantages that the SQL server had to send the results back to you first and then you would have to create a routine which would be at least more efficient than that employed by the DB itself (remember you have to overcome the receiving of data from the SQL server and needing to send the results of your processing back to the SQL server for the second query. If your processing is really that efficient then you should probably store the routine on the SQL server as a stored function for more efficient processing and then you don't need to overcome the delay of transmitting back-and-forth with the SQL server (or you could extend the SQL server, but that's a bit more challenging).
|
|
|
|
05-08-2007, 05:40 AM
|
Re: Speed for Join statement
|
Posts: 3,024
Name: Forrest Croce
Location: Seattle, WA
|
What were the circumstances where the multiple queries worked faster? That's possible, depending on how many rows are in each table, the expression your join is on ( you could join on a string pattern match to really hose things up ), how many rows come out of the queries ( individually and after being joined ), and how you cache data in the client application.
Without knowing all that, in general, the joins should be faster. It's one of the main things the database was meant to do. If you're using SQL Server, you're even taking advantage of optimizations in disc access, shared I/O among concurrent queries, and things like that.
|
|
|
|
05-08-2007, 06:19 AM
|
Re: Speed for Join statement
|
Posts: 2,055
Name: Thierry
Location: In the void
|
Quote:
|
There's more than one type of join and they each have performance benefits/costs depending on the query
|
Totally true...
And outer joins particulary come in my mind about that...
I have sometimes seen a speed up by breaking a large, complicated query (I'm talking about 30 tables here) into a lot smaller one, doing loop in PHP from the a somewhat large, but simplier query.
If you are looking for speed optimisation, take a closer look at the indexes posed on each tables.
They can do the difference between a slug and a rabbit....
Look in your dbms manual to find the syntax to display the query plan.
in mysql, its EXPLAIN added before the query.
This will make mysql run the query planner and explaining which index he could use, without doing the query.
In postgresql, you can use EXPLAIN too, and even go further by using
EXPLAIN ANALYZE {query}
It will run the query, and show you which path and how many time it took in each loop.
__________________
Listen to the ducky: "This is awesome!!!"
|
|
|
|
05-08-2007, 06:21 AM
|
Re: Speed for Join statement
|
Posts: 984
Name: Jeremy Miller
Location: Reno, NV
|
Quote:
Originally Posted by tripy
If you are looking for speed optimisation, take a closer look at the indexes posed on each tables.
|
Couldn't agree more.
|
|
|
|
05-08-2007, 07:32 AM
|
Re: Speed for Join statement
|
Posts: 522
Name: Gabe Solomon
Location: Romania
|
the real situation tested was this
i have a table with companies :
name - varchar
town - int
bank - int
type - int
so in my conding i have 4 separate function :
getinfo_company($id) // this calls on all other 3
getinfo_town($id)
getinfo_bank($id)
getinfo_type($id)
and each of those do a
SELECT * FROM `table_name` WHERE id='$id'
so when i have to get the info about a company i do 4 query's that seem to work faster then a single one ....
personaly i like the conding better this way ( spliting the code into 4 functions ) plus in each one ... i can add some more coding if i want to add more info about a town for example .. and it will afect every script that uses the town information
But after reading about how using join is better then using separate querry's i wanted to know what do you think
|
|
|
|
05-08-2007, 07:39 AM
|
Re: Speed for Join statement
|
Posts: 984
Name: Jeremy Miller
Location: Reno, NV
|
Assuming c.id is a primary key on a company table and all 4 queries are for a different table, can't you
Code:
SELECT c.company, tn.town, b.bank, t.type FROM table_company c, table_town tn, table_bank b, table_type t WHERE c.id=$id AND tn.id=c.id AND b.id=c.id AND t.id=c.id
That's using MySQL and assuming the $id has been sanitized as a number.
-- Oh, and you should only select * if absolutely necessary; otherwise, just select what you want.
|
|
|
|
05-08-2007, 07:54 AM
|
Re: Speed for Join statement
|
Posts: 522
Name: Gabe Solomon
Location: Romania
|
i know you can do that ....
but wich one is much faster ? or wich one is most recomended
until now ... my coding style was as i shown you before
do a function for every table .... and then call that function where ever i need it
that has some advantages ...
1. every tables data is extracted only in one place ... so if you add a new field or add a new virtual field that will be available every where in you're script
2. it makes more sense to me because i can do like a big array
eg : $company[user][subscribe][type][name]
that is the way i find out what the user subscription name is ... after caling the getinfo_company function .... wich would mean a LONG sql with join ( it goes trough 4 tables : company,users,subscriptions,subscriptions_type )
and that is only for the subscription part ... for all the company info ... i need to join like 10 tables
|
|
|
|
05-08-2007, 06:11 PM
|
Re: Speed for Join statement
|
Posts: 984
Name: Jeremy Miller
Location: Reno, NV
|
Quote:
Originally Posted by solomongaby
eg : $company[user][subscribe][type][name]
|
If you want that kind of depth, multiple functions is the only way to go afaik. I thought you wanted results which could be laid out in a table, but you effectively have a mulit-dimensional table so things don't go so well with SQL. Be wary of loading too much info into memory, however. If all the data for each user is loaded into memory and you have 500,000 users, you may have memory issues.
|
|
|
|
05-09-2007, 04:34 PM
|
Re: Speed for Join statement
|
Posts: 522
Name: Gabe Solomon
Location: Romania
|
well i never load all the users ... i always use a limit with my search
|
|
|
|
05-09-2007, 06:36 PM
|
Re: Speed for Join statement
|
Posts: 984
Name: Jeremy Miller
Location: Reno, NV
|
Quote:
Originally Posted by solomongaby
well i never load all the users ... i always use a limit with my search
|
That's not really what I meant - I mean that if a lot of people are visiting your site at once and you're loading a bunch of info for each person (even if it's just that person's data), then you risk memory issues. Does that make sense? I don't know what scale your site is on, but I've had to consider such things with some of the software I write.
|
|
|
|
05-09-2007, 06:51 PM
|
Re: Speed for Join statement
|
Posts: 522
Name: Gabe Solomon
Location: Romania
|
the website is going to have around 1000 visitors a day ... when it opens officialy .... and is thoght to grow to 5000 in the first 6 month
now it has an average of 100 ... and we dint even oficialy open it ... and didnt do any promoting what so ever 
|
|
|
|
05-09-2007, 06:54 PM
|
Re: Speed for Join statement
|
Posts: 984
Name: Jeremy Miller
Location: Reno, NV
|
Then it's likely that you'll be OK as the memory used is only the memory at the time the page is accessed, so with 1000 visitors per day, that's roughly 2/3 of a visitor per minute which gives plenty of time for the script to do it's thing, die, and the memory to be available again.
|
|
|
|
05-10-2007, 03:23 AM
|
Re: Speed for Join statement
|
Posts: 13,576
Location: Blackpool. UK
|
The number of visitors a day is really not that important, they might be spread throughout the entire 24 hours, only look at one page and leave. Then on the other hand, they might all arrive at once and view 4 or 5 pages.
page views per second/minute is the metric that will determine how fast server resources have to recover to ensure continuity of service. Your 1000 visitors a day could generate anywhere between 2 pv/minute and 1000 pv/second
In your traffic estimates have you allowed for search engine crawlers and "scrapers"?
A search engine crawler running at full speed (I'm assuming you will be using static looking URLs here), can potentially be hitting a site at 10 second intervals (data from my logs) over periods ranging from 2 to 3 minutes to a "deep crawl" of 30 minutes or more, and they are well behaved. The scrapers can be double that, and will continue at that pace until every page has been "viewed".
__________________
Chris. ->> Links are advertising NOT optimising!! <<-
Indifference will be the downfall of mankind, but who cares?
Code Samples | People Counting System
|
|
|
|
05-10-2007, 03:28 AM
|
Re: Speed for Join statement
|
Posts: 984
Name: Jeremy Miller
Location: Reno, NV
|
Nice clarification chrishirst. I did make an unstated assumption: roughly even distribution of visitors, or if not even, then not significantly greater than the average. Those assumptions would fail miserably (where 1000 pv/second may happen) if a special event were advertised at a particular time attracting such an audience at one point.
Instead of doing all the math though, it'd probably just be easier to check out your graphical stats :-) The math is very instructive though!
|
|
|
|
05-10-2007, 12:31 PM
|
Re: Speed for Join statement
|
Posts: 522
Name: Gabe Solomon
Location: Romania
|
I did take into consideration the crawler ...
And i had a google bot invasion once ... it had 300 session opened ....
and the page generation time rise by only 0.05 sec
thank you all for you're information.
|
|
|
| |