Parallel PHP Batch Apps in the Background...
Old 05-06-2008, 12:12 PM Parallel PHP Batch Apps in the Background...
Novice Talker

Posts: 7
Name: Shawn
Background:
Dedicated server on Apache / PHP 5.x / MySQL 4.x

What I am doing:
I need some suggestions on the best way to handle the following situation:

My application takes input from an XLS file and imports the data into a MySQL database. When the import is done, each record needs a lot of processing, typically 30-60 seconds' worth. Most of that time is spent waiting for CURL requests to other services/servers.

It works fine, except when I get a lot of records...this process can take forever because each record is processed in order. Several hundred records can take hours. However, the server is basically idle and waiting, so I wanted a way to do a bunch of these in parallel.

So I set up a bunch of batch control processes and tables so the background processes (batch.php) can run in parallel and not collide with records from the other processes. I can kick off a handful of these processes and everything works great.

The batch.php script does all the work: it gets the next record and processes it. It then looks for a "kill signal" in the database...if it sees one, it stops. If not, it grabs the next record. If there is no next record, it also stops.

So basically it runs until it runs out of records to work on.
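
A minimal sketch of such a worker loop, to show one way parallel workers can avoid colliding on the same record. The table and column names (records, status, control, kill_flag) and the helper functions are assumptions made for illustration, not the poster's schema, and an in-memory SQLite database (needs the pdo_sqlite extension) stands in for MySQL only so the sketch runs on its own.

```php
<?php
// Hypothetical schema: records(id, status) plus a one-row control(kill_flag) table.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE records (id INTEGER PRIMARY KEY, status TEXT)");
$db->exec("CREATE TABLE control (kill_flag TEXT)");
$db->exec("INSERT INTO control VALUES ('n')");
$db->exec("INSERT INTO records (status) VALUES ('Pending'), ('Pending'), ('Pending')");

// Try to claim one pending record; returns its id, or null when none are left.
function claimNextRecord(PDO $db) {
    while (true) {
        $id = $db->query("SELECT id FROM records WHERE status = 'Pending' ORDER BY id LIMIT 1")->fetchColumn();
        if ($id === false) {
            return null; // nothing left to do
        }
        // The status check in the WHERE clause means only one worker's UPDATE can match.
        $claim = $db->prepare("UPDATE records SET status = 'Processing' WHERE id = ? AND status = 'Pending'");
        $claim->execute(array($id));
        if ($claim->rowCount() === 1) {
            return (int) $id;
        }
        // Another worker claimed this row first; look for the next one.
    }
}

// True when the (hypothetical) kill flag has been set to 'y'.
function killRequested(PDO $db) {
    return $db->query("SELECT kill_flag FROM control")->fetchColumn() === 'y';
}

$processed = array();
while (!killRequested($db) && ($id = claimNextRecord($db)) !== null) {
    // ... the real 30-60 seconds of per-record work would go here ...
    $db->prepare("UPDATE records SET status = 'Done' WHERE id = ?")->execute(array($id));
    $processed[] = $id;
}
echo "Processed: " . implode(',', $processed) . "\n";
```

The conditional UPDATE is the part that matters: a worker only treats a record as its own when its UPDATE actually changed the row, so two workers that select the same id cannot both process it.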

The problem:
The problem shows up when I kick off a lot of these processes (10-30). While I am starting the additional batches, the server "hangs" for a few seconds as each process starts, and the processes that were already running see the server stall and all stop as well.

I'm kicking off the processes in Javascript with a button on a page, "Add 30 processes". That button loops 30 times...each time it makes an XMLHttpRequest to "batch.php". This call essentially starts an instance of the batch program on the server in the background.

What I think is happening:
I think that when I kick off this many processes so quickly, the server chokes trying to catch up...it's like having 30+ concurrent users (plus whoever else is using the server).

What I think is needed:
I would like to find a way for a PHP app (controller.php) to kick off "x" number of batch processes (batch.php) in parallel, similar to how the Javascript currently does. That way, I can have "controller.php" look at the server and pace the new processes so the problem doesn't happen.

Thoughts?
sbritton is offline
 
Old 05-06-2008, 12:36 PM Re: Parallel PHP Batch Apps in the Background...
VirtuosiMedia's Avatar
Webmaster Talker

Posts: 735
Does this help?
 
Old 05-06-2008, 12:53 PM Re: Parallel PHP Batch Apps in the Background...
Novice Talker

Posts: 7
Name: Shawn
I had already looked at the "pcntl_fork" feature, but understood that it can't be used when PHP runs as an Apache module...only when running from a true shell/batch command.

Now that may not be entirely accurate, but looking at all of the warnings and suggestions against using it...I'm a little concerned putting something like that into a production environment when very few people (most importantly me) understand what it is.

Surely there is a way to initiate an async/background/parallel HTTP request (http://www.server.com/batch.php) or even a shell "wget http://www.server.com/batch.php" from native PHP...isn't there?

I guess, worst case, I can set up a cron job to run every minute. It counts the number of active batch jobs (as reported by my control table). If there are fewer than "x" running processes, it starts another one. This means it would take up to 30 minutes to fully spool up a batch of 30 processes. Too bad I can't run cron jobs more frequently than every minute.
sbritton is offline
 
Old 05-06-2008, 01:04 PM Re: Parallel PHP Batch Apps in the Background...
Novice Talker

Posts: 7
Name: Shawn
I think I have an idea I need to test.

I could have a controller.php that is called from the web page's "start batch" button.

That program then does the following:

-------controller.php-----------
while ($num_parallel < 30 and $done == 'n') {
    curl ("batch.php")    // Each call starts one individual worker
    curl timeout = 1      // Set a low timeout because we don't care if it is still running
    execute curl

    look for $done flag in database    // See if the user wants to kill everything

    $num_parallel = count # of batch.php via control_table
}


---------batch.php--------------
count # of batch.php via control_table

while ($rec_unprocessed > 0 and $done == 'n' and #batch < 30) {
    get next record
    process record
    update control table
    look for $done flag in database    // Pick up the kill signal between records
}

What the controller will do is loop, blindly calling the batch process via curl each time, but timing out after one second, because we don't care what it is doing...and are not getting back a result. We are just using the curl function to start something else in the background without waiting for it to finish.
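
One caveat with this approach: when curl times out, the caller drops the connection, and PHP may stop a script whose client has gone away the next time the script tries to send output. A minimal sketch of what the top of batch.php could look like to guard against that, assuming it is started over HTTP. Both functions are standard PHP; placing them here is a suggestion, not something from the thread.

```php
<?php
// Top of a hypothetical batch.php started over HTTP by a caller
// that hangs up after a short timeout.

// Keep running even after the caller disconnects.
ignore_user_abort(true);

// Remove PHP's execution time limit for this long-running worker.
set_time_limit(0);

// ... the worker loop would follow here ...
```

The web server's own timeouts can still apply, so this is only the PHP side of the picture.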
sbritton is offline
 
Old 05-06-2008, 01:17 PM Re: Parallel PHP Batch Apps in the Background...
VirtuosiMedia's Avatar
Webmaster Talker

Posts: 735
I was going to suggest an OOP approach, but you just posted something similar.
 
Old 05-06-2008, 01:27 PM Re: Parallel PHP Batch Apps in the Background...
addonchat's Avatar
Skilled Talker

Posts: 97
Name: Chris Duerr
A thread would be preferable, but as far as I know PHP doesn't support it, and there's no real reason it should.

If you have to do it in PHP, Virtuosi gave you what you need. Your controller will run as a daemon that forks off children. You might want the forked children to handle only the bottleneck (waiting on web services and the like) and communicate the results back to the parent using SysV IPC (http://us2.php.net/manual/en/ref.sem.php) or similar, so you don't have to worry about database conflicts. Remember to set up signal handlers (http://us.php.net/pcntl_signal) as well, and to keep tabs on both the number of processes you're spinning off and the available system memory. And don't cycle the CPU like mad; find a PHP equivalent of the wait() system call.

You didn't mention how you're currently kicking off processes, but the cool thing about fork is that the OS doesn't need to allocate more memory for the entire PHP application; it just allocates more memory for a new IP and whatever data your app is creating. The downside is that the PHP app is an interpreter with a lot of overhead, so fork isn't nearly as appealing as it would be on a small C app, for instance.

This is a much better job for Java or C/C++. Java interprets bytecode, but it will have all the libraries you need ready to go, and you could use threads instead of dealing with forking and IPC. It should be much faster.

If it's possible, caching web service results for a short duration may also help.
__________________
Chris Duerr
AddonChat Java Chat Software
http://www.addonchat.com/ - Affiliate Program
 
Old 05-06-2008, 01:33 PM Re: Parallel PHP Batch Apps in the Background...
addonchat's Avatar
Skilled Talker

Posts: 97
Name: Chris Duerr
Oops, sorry. I was putting my reply together while you were responding, I guess.

If you don't care about the output from the web service and need a bit more speed, just use a port 80/8080/443 etc.. socket connection (non-blocking, TCP) and send a manual HTTP GET query.
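
A rough sketch of the hand-written GET idea. The helper names, host, and path are made up for illustration. For simplicity it uses a blocking connect with a short timeout rather than a truly non-blocking socket, and it assumes the target script keeps running after the caller disconnects (for example by calling ignore_user_abort(true)).

```php
<?php
// Build a minimal HTTP/1.1 GET request by hand.
function buildGetRequest($host, $path) {
    return "GET " . $path . " HTTP/1.1\r\n"
         . "Host: " . $host . "\r\n"
         . "Connection: close\r\n\r\n";
}

// Open a TCP connection, send the request, and close without reading the reply.
// Returns true if the whole request was written.
function fireAndForget($host, $path, $port = 80, $timeout = 2) {
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        return false; // could not connect within the timeout
    }
    $request = buildGetRequest($host, $path);
    $written = fwrite($fp, $request);
    fclose($fp);
    return $written === strlen($request);
}

// Hypothetical usage (host and path are placeholders, not from the thread):
// fireAndForget('www.example.com', '/batch.php');
```

Because the reply is never read, the caller returns as soon as the request bytes are handed to the socket, which is the speed gain over waiting on a curl timeout.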
 
Old 05-06-2008, 02:06 PM Re: Parallel PHP Batch Apps in the Background...
VirtuosiMedia's Avatar
Webmaster Talker

Posts: 735
You might also find this article useful.
 
Old 05-06-2008, 02:30 PM Re: Parallel PHP Batch Apps in the Background...
Novice Talker

Posts: 7
Name: Shawn
I tried the following and it seems to work pretty well:

Starting controller<br>
<?php
include("inc_database.php");   // opens the database connection
$procCount = 0;
$kill = "n";

// Keep starting workers until 15 are registered or no work is left
while ($procCount < 15 && $kill == "n") {

    // Call batch.php and stop waiting for it after 3 seconds
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://www.???.com/batch.php");
    curl_setopt($ch, CURLOPT_TIMEOUT, 3);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // keep any worker output out of this page
    $response = curl_exec($ch);
    curl_close($ch);

    echo "---starting a batch---<br>";

    // Count the number of processes
    $query = "select * from batch_control";
    $result = mysql_query($query);
    $procCount = mysql_num_rows($result);

    // See if there is a batch to process
    $query = "select * from batch where batch_status in ('Pending','Processing')";
    $result = mysql_query($query);
    if (mysql_num_rows($result) == 0) { $kill = "y"; }

}
?>

Ending Controller<br>
procCount:<?php echo $procCount; ?><br>
kill:<?php echo $kill; ?><br>
sbritton is offline
 
Old 05-06-2008, 02:49 PM Re: Parallel PHP Batch Apps in the Background...
addonchat's Avatar
Skilled Talker

Posts: 97
Name: Chris Duerr
The only disadvantage to taking that route is that you're putting unnecessary stress on your web server and opening unnecessary sockets. FYI, most web servers simply fork themselves to handle new requests. On a popular site, you'll saturate your web server. It's best to run it as a watchdog'd daemon in the background, and fork/thread off anything that isn't CPU/memory hungry but requires a lot of run time. Just my two cents, though.
 
Old 05-06-2008, 03:35 PM Re: Parallel PHP Batch Apps in the Background...
Novice Talker

Posts: 7
Name: Shawn
Does it matter if the child processes are fully self-sufficient? In other words, they do not want/need to communicate with the calling program.

They know if they should start or not...and if/when they should stop.

They will run basically forever until all of the work is done. So I can't have whatever calls them wait around for a response.

Using sockets...can I open (and therefore start the process), then close even though nothing came back?

Using the forks...I don't want the parent program to remain open to deal with shutting down the children or dealing with them at all.

Is there another way to execute a shell command, that doesn't wait for a response so I can start the batch.php without impacting the webserver via http?

In reality...I want to have the "children processes" and "send them off to college" immediately, then change my phone number so they never call asking for money. *smile*
sbritton is offline
 
Old 05-06-2008, 03:46 PM Re: Parallel PHP Batch Apps in the Background...
addonchat's Avatar
Skilled Talker

Posts: 97
Name: Chris Duerr
I was suggesting sockets as an alternative to CURL, having misunderstood your original solution -- so don't worry about that.

For what you're doing, you don't want to use fork, exec, system, etc.. from a PHP script running within the web server.

Typically, when you kill the parent process you want all the children killed too. It's just the responsible thing to do, which is why you'll want to at least catch SIGHUP/SIGTERM, etc. (SIGKILL itself can't be caught or handled).

Since you don't have to worry about the batch.php program communicating back to the parent, no need to worry about IPC

Write the program, and run it in CLI mode; probably as a startup script or something. Basically, where you're making your curl call above, you'd replace it with a fork to call something like '/location/to/php /location/to/script'
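
A minimal sketch of what such a CLI controller might look like, assuming the pcntl extension is available (it is meant for command-line PHP, not the Apache module). The names runWorkers and demoJob are invented for the example, and demoJob only sleeps where the real batch work would go.

```php
<?php
// CLI only. Forks $workers children, runs $job in each, and waits for all of them.
// Returns a map of child PID => exit code.
function runWorkers($workers, $job) {
    $pids = array();
    for ($i = 0; $i < $workers; $i++) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            break; // fork failed; stop starting new workers
        }
        if ($pid === 0) {
            // Child: do the work, then exit so it never falls back into the parent's loop.
            exit((int) call_user_func($job, $i));
        }
        $pids[$pid] = null; // parent: remember the child
    }
    // Block in pcntl_waitpid() rather than polling, so the idle parent burns no CPU.
    foreach (array_keys($pids) as $pid) {
        pcntl_waitpid($pid, $status);
        $pids[$pid] = pcntl_wexitstatus($status);
    }
    return $pids;
}

// Placeholder job: stands in for the real per-worker batch loop.
function demoJob($index) {
    usleep(100000);
    return 0; // exit code reported back to the parent
}

$results = runWorkers(3, 'demoJob');
echo count($results) . " workers finished\n";
```

To launch a separate script instead of a function, the child branch could call pcntl_exec() with the path to the PHP binary and the script as its argument. Each child should also open its own database connection rather than reuse one inherited from the parent.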
 
Old 06-20-2008, 04:16 AM Re: Parallel PHP Batch Apps in the Background...
Junior Talker

Posts: 1
You can do parallel curl calls natively in PHP.

http://www.jaisenmathai.com/blog/200...hp-multi_curl/
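
A short sketch of the curl_multi approach, which could help if the per-record time is mostly spent waiting on several remote services. The function name fetchAll and its default timeout are illustrative choices, not taken from the linked article.

```php
<?php
// Fetch several URLs in parallel with curl_multi; returns a map of url => response body.
// Note: duplicate URLs would overwrite each other in the result map.
function fetchAll(array $urls, $timeout = 60) {
    $mh = curl_multi_init();
    $handles = array();
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of echoing it
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0 && curl_multi_select($mh, 1.0) === -1) {
            usleep(100000); // select could not wait; back off briefly instead of spinning
        }
    } while ($running > 0 && $status === CURLM_OK);

    $bodies = array();
    foreach ($handles as $url => $ch) {
        $bodies[$url] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);
    return $bodies;
}

// Hypothetical usage (URLs are placeholders):
// $bodies = fetchAll(array('http://www.example.com/a', 'http://www.example.com/b'));
```

With this, the total wait is roughly that of the slowest request rather than the sum of all of them, and everything stays inside a single PHP process.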
jmathai is offline
 
Old 06-20-2008, 02:31 PM Re: Parallel PHP Batch Apps in the Background...
Learning Newbie's Avatar
Moderator

Posts: 5,181
Name: John Alexander
ASP.NET's multithreading support would make short work of this.
__________________
4 ways to improve the lives of the "bottom billion"

"HEY YOU KIDS GET OFF MY LAWN!" -John McCain
 