![]() |
|
|
How many people batch updates? |
|
Moderator
![]()
Latest Blog Post:
My Favorite Isaac Asimov Story Posts: 4,070
Name: John Alexander
|
Say you need to update each of 1,000,000 rows in a table. Do you do one big update, or break it into batches? How do you decide?
I'm just curious how other programmers tend to approach this situation?
__________________
HungarianNotation is the last resort of scoundrels. Why not the first resort? That's where it counts! |
|
|
|
| Sponsored Links (We share ad revenue): |
|
|
Re: How many people batch updates? |
|
Average Talker
Posts: 21
|
well it depends upon what update you are trying to do.
Like if you want to update single field with a new common value for 10,000 rows then I prefer batch update. OR if the update is for some group of rows i.e. 100 rows will have certain value say 1 and another 400 rows will have value say 2 then offcourse we will need to split the rows and perform update
__________________
What is FTP and how to use FTP Client? - Read Article Google Analytics Interface Tutorial - Watch Video |
|
|
|
|
|
Re: How many people batch updates? |
|
Fetchez la vache!
Latest Blog Post:
Pretty pretty please…. Posts: 1,689
Name: Thierry
Location: In the void
|
For me, it's the dev/prod server that trigger this choice.
I must sometimes do that kind of maintenance on production server. The kind of server that must (should...) have 100% uptime, medical related for a distributed grid. I cannot allow to row-lock a big table for a long time, and delay every other operations. So, when this arise: 1) I do this between 2 and 5 am (because that's a open window between several maintenance automated operations) 2) I split the job in batch/transactions to give more granularity in the lock time 3) I pray a lot that everything runs ok...
__________________
Listen to the ducky: "This is awesome!!!" |
|
|
|
|
|
Re: How many people batch updates? |
|
Moderator
![]()
Latest Blog Post:
My Favorite Isaac Asimov Story Posts: 4,070
Name: John Alexander
|
So I guess I could have been more clear. In SQL Server, you can send a lot of different SQL commands as part of a batch. But this isn't what I'm asking.
UPDATE dbo.BigTable SET Column1 = UDF(?), Column2 = UDF(?), etc Or SET ROWCOUNT 100000 While Exists (SELECT Column1 FROM dbo.BigTable WHERE Column1 != UDF(?), Column2 != UDF(?)) Begin UPDATE dbo.BigTable SET Column1 = UDF(?), Column2 = UDF(?), etc End The first script, just a simple update, will run over the entire table. The second will, also, but only 1/10th at a time. The advantage of this is that while the update is running, you have no idea how far along the way it is. Should it take 10 hours, after you've waited 9 of them, you don't really know if it'll be another 10 hours, or another 10 minutes. Breaking the code up into granular batches gives you more info into what's going on, at the expense of forcing more work on the server. For example, in the second script, the server will have to query the table for the existence of a particular condition 10 times. Each of these could trigger a table scan! But as you say, it lets us take down clients for a lesser time. Somewhat more downtime over all, but in shorter, less painful increments.
__________________
HungarianNotation is the last resort of scoundrels. Why not the first resort? That's where it counts! |
|
|
|
|
|
Re: How many people batch updates? |
|
Ultra Talker
Latest Blog Post:
Confessions and Reflections of a [Former] ... Posts: 433
Name: Tamar Weinberg
Location: New York
|
I'm not a heavy programmer, but when I have to do something like this, why would I break it into chunks unless it's an issue of server resources? I'd typically do a test run with 1-2 rows and see if it produces the desired results, and if it does, I'll just do the entire thing. No need to make the update more complicated than it has to be.
__________________
Techipedia: All About Social Media |
|
|
|
|
|
Re: How many people batch updates? |
|
Moderator
![]()
Latest Blog Post:
My Favorite Isaac Asimov Story Posts: 4,070
Name: John Alexander
|
I guess that's the question, Tamar. When does the added, strictly unnecessary complexity pay off?
And how does time factor into that? Suppose you were expecting the update command to run for 18 hours - just as a very extreme example. I'd want to batch things in that case, as long as it were possible to bring the system back up afterwards with only part of the update completed. Oracle doesn't do this, but MSSQL and MySQL do. They'll take locks at the row level, so if I've just updated a particular row, no one else can touch that row until my transaction finishes. That's not bad - out of 1m, to have 1 locked is nothing. Nobody else will probably want to touch it. But, when I change the 2nd row, now I have 2 rows locked. Pretty soon it's 100, then 1,000. You see where this is going. Well, eventually, the database gets to the point where it's using too much memory locking individual rows, and escalates the lock. Now we've got range locks, if they can apply, if not or if they do for a while, we soon escalate again until the entire table is locked. Which means no one else can access it until my transaction commits. And in a case like that, normally the business rules prevail. There are such and such requirements, there's a time budget for how long things can be taken off line. But this isn't always the case, and quite often we don't have a reliable way to know just how long something will run. Worse, the business people trust the tech people. Quite often, we don't have a budget. The bosses will say to us "You know our goals and priorities, and you know better than anyone how the system works. Do whatever you think is best for the system."
__________________
HungarianNotation is the last resort of scoundrels. Why not the first resort? That's where it counts! |
|
|
|
|
|
Re: How many people batch updates? | |
|
Ultra Talker
Latest Blog Post:
Confessions and Reflections of a [Former] ... Posts: 433
Name: Tamar Weinberg
Location: New York
|
Quote:
I cannot even fathom myself to be in a situation like that!
__________________
Techipedia: All About Social Media |
|
|
|
|
|
|
Re: How many people batch updates? |
|
Moderator
![]()
Latest Blog Post:
My Favorite Isaac Asimov Story Posts: 4,070
Name: John Alexander
|
Well, by strictly unnecessary, I mean that it isn't strictly necessary. Or that you could get by without it. But I'd also say that it's not strictly necessary to use a database at all when flat files have worked so well since the 1960s, but that we do get some benefits from SQL.
That doesn't mean I'm in the same boat as a relational database, just an example.
__________________
HungarianNotation is the last resort of scoundrels. Why not the first resort? That's where it counts! |
|
|
|
| Sponsored Links (We share ad revenue): |
| Thread Tools | |
|
|
| Webmaster Resources Marketplace: |
| Software Development Company | Webhosting.UK.com |
| Web Templates | Text Link Brokers | Stock Photos |