We all know we shouldn't be greedy and ask for more than we need or plan to use. That goes for rows and columns. Select * From Something is bad, and not having a where clause is one reason it's bad. ( Unless it's a small lookup table for a drop down box or something.)
Top N seems to be the preferred way to limit the output of a result set when we only want, say, 10 rows. Different databases have different implementations. In SQL Server, Top is a reserved word that immediately follows Select, and modifies how the query is run. Access is the same. MySQL isn't far off, it uses a Limit and optional Offset at the end of the query. Oracle gives a fake column called RowNum, that you can use from your Where clause. I can't speak for other databases.
SQL Server also has a SET ROWCOUNT N feature. Sometimes it's better and sometimes it's worse than TOP N. It also has a TABLESAMPLE feature. SELECT FirstName, LastName
FROM Person.Contact
TABLESAMPLE (10 PERCENT) The cool thing is that that query runs fast. Blazingly fast. There are limits, but you're telling the database you don't need true data that might find its way into a transaction, you want to see the table's data, just to look at. Which allows for more aggressive optimizations, which could appear to break referential immunity.
In some research, I found this
If you really want a random sample of individual rows, modify your query to filter out rows randomly, instead of using TABLESAMPLE. For example, the following query uses the NEWID function to return approximately one percent of the rows of the Sales.SalesOrderDetail table: SELECT * FROM Sales.SalesOrderDetail
WHERE 0.01 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
/ CAST (0x7fffffff AS int)
The SalesOrderID column is included in the CHECKSUM expression so that NEWID() evaluates once per row to achieve sampling on a per-row basis. The expression CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float / CAST (0x7fffffff AS int) evaluates to a random float value between 0 and 1.
|