
A Simple PowerShell Script for Parallel Tasks

I recently needed to install software and send commands to more than a thousand servers.

The machines were already split into dozens of groups, with one deployment node in each group acting as a relay. Even so, I was still sending commands to those deployment nodes one by one from my local machine and waiting for each result, which wasted a lot of time.

At that point it became obvious that once deployment reaches a certain scale, serial execution is just too slow. So I started experimenting with parallel task execution in PowerShell.

PowerShell provides Start-Job, which launches a background job in the current session. Once a job is started, it runs independently and does not block the current shell. Even long-running work can continue quietly in the background.

You can inspect job state with Get-Job, wait for completion with Wait-Job, and retrieve output with Receive-Job.
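As a quick illustration of that lifecycle, here is a minimal sketch. The job name demo_job and the two-second sleep are made up for the example and are not part of my deployment script.

```powershell
# Start a background job; Start-Job returns right away with a job object.
$job = Start-Job -Name "demo_job" -ScriptBlock {
    Start-Sleep -Seconds 2
    "done"
}

# The shell is not blocked, so the job state can be inspected while it runs.
(Get-Job -Name "demo_job").State

# Block until the job has finished, then collect whatever it wrote to output.
Wait-Job -Job $job | Out-Null
$result = Receive-Job -Job $job
$result

# Clean up the finished job so it does not linger in the session.
Remove-Job -Job $job
```

Receive-Job hands back whatever the script block emitted, so $result here holds the string the block returned.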

Based on that, I wrote the following simple script:

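# Define six test tasks; each one just sleeps for a different number of seconds
$script_block_1 = { Start-Sleep -Seconds 10 }
$script_block_2 = { Start-Sleep -Seconds 16 }
$script_block_3 = { Start-Sleep -Seconds 8 }
$script_block_4 = { Start-Sleep -Seconds 13 }
$script_block_5 = { Start-Sleep -Seconds 7 }
$script_block_6 = { Start-Sleep -Seconds 10 }

# Task queue
$script_array = $script_block_1, $script_block_2, $script_block_3,
                $script_block_4, $script_block_5, $script_block_6

# Maximum number of jobs running at the same time (must be at least 1)
$parallel_count = 1

# Start timing
$start_time = Get-Date

# Remove any background jobs left over in this session, including running ones
Get-Job | Remove-Job -Force

$total_task_count = $script_array.Length

# Never open more slots than there are tasks
$init_task_count = [Math]::Min($parallel_count, $total_task_count)

# Fill every slot with an initial job
foreach ($i in 1..$init_task_count)
{
    Start-Job -ScriptBlock $script_array[$i - 1] -Name "parallel_job_$i"
}

$next_index = $init_task_count

# Poll the slots; whenever one has finished, reuse it for the next queued task
while ($next_index -lt $total_task_count)
{
    for ($i = 1; $i -le $init_task_count; $i++)
    {
        $state = [string](Get-Job -Name "parallel_job_$i").State

        # Treat any finished state as a free slot so a failed job cannot stall the queue
        if ("Completed", "Failed", "Stopped" -contains $state)
        {
            Remove-Job -Name "parallel_job_$i"
            Start-Job -ScriptBlock $script_array[$next_index] -Name "parallel_job_$i"
            $next_index++
        }

        if ($next_index -ge $total_task_count)
        {
            break
        }
    }

    # Pause briefly between polling rounds so the loop does not spin the CPU
    Start-Sleep -Milliseconds 200
}

# Wait for the jobs that are still running
Get-Job | Wait-Job
"All jobs done!"

# Total elapsed time in seconds
(New-TimeSpan $start_time).TotalSeconds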

At the top, the script defines six separate script blocks. For testing, each one just sleeps for a different amount of time.

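These blocks are placed in $script_array as the task queue. $parallel_count controls how many jobs may run at the same time. It must be at least 1: in PowerShell the range 1..0 counts down (1, 0) rather than being empty, so a value of 0 would make the startup loop misbehave.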

Since different tasks finish at different times, the script polls the running jobs. As soon as one finishes, it immediately pulls the next task from the queue and launches it, so task slots do not sit idle.
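For comparison, the same "refill a slot as soon as it frees up" idea can also be written without polling, by letting Wait-Job -Any block until one of the running jobs finishes. This is not what my script does; it is only a sketch of an alternative, and the names $tasks, $maxParallel, $running and $results, as well as the short demo tasks, are made up for the example.

```powershell
# Three short demo tasks (hypothetical); each returns a letter so results can be checked.
$tasks = { Start-Sleep -Seconds 2; "a" },
         { Start-Sleep -Seconds 1; "b" },
         { Start-Sleep -Seconds 1; "c" }

$maxParallel = 2   # assumed limit on simultaneously running jobs
$running = @()     # jobs currently occupying a slot
$results = @()     # output collected from finished jobs

foreach ($task in $tasks) {
    # If every slot is busy, block until at least one job has finished.
    if ($running.Count -ge $maxParallel) {
        $finished = @(Wait-Job -Job $running -Any)
        $finishedIds = @($finished | ForEach-Object { $_.Id })

        # Collect output before removing the job, otherwise it is lost.
        $results += Receive-Job -Job $finished
        Remove-Job -Job $finished

        # Keep only the jobs that are still running.
        $running = @($running | Where-Object { $finishedIds -notcontains $_.Id })
    }

    $running += Start-Job -ScriptBlock $task
}

# Drain whatever is still running once the queue is empty.
if ($running.Count -gt 0) {
    Wait-Job -Job $running | Out-Null
    $results += Receive-Job -Job $running
    Remove-Job -Job $running
}

$results
```

The trade-off is that the shell sleeps inside Wait-Job instead of looping, at the cost of tracking the job objects in a list rather than addressing them by slot name.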

I measured the total runtime with different parallelism settings:

parallel count = 1: 70.0920091 seconds
parallel count = 3: 29.3396782 seconds
parallel count = 6: 21.4452266 seconds

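So yes, parallel execution really does reduce the total time: going from one job at a time to six cut the run from about 70 seconds to about 21 seconds, a little more than three times faster.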
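As a rough sanity check on those numbers, the ideal finish time can be worked out from the sleep durations alone, ignoring the cost of starting jobs and polling. The sketch below assumes each task goes to whichever slot frees up first, which is approximately what the polling loop does. Get-IdealMakespan is a hypothetical helper written for this example.

```powershell
# Hypothetical helper: ideal total time when each queued task is handed
# to the slot that becomes free earliest (no job-start or polling overhead).
function Get-IdealMakespan {
    param([int[]]$Durations, [int]$Slots)

    $Slots = [Math]::Min($Slots, $Durations.Length)
    $busyUntil = @(0) * $Slots   # time at which each slot becomes free

    foreach ($d in $Durations) {
        # Find the slot that frees up first.
        $min = 0
        for ($s = 1; $s -lt $Slots; $s++) {
            if ($busyUntil[$s] -lt $busyUntil[$min]) { $min = $s }
        }
        $busyUntil[$min] += $d
    }

    # The run ends when the last slot finishes.
    ($busyUntil | Measure-Object -Maximum).Maximum
}

# Sleep durations of the six test tasks, in queue order
$durations = 10, 16, 8, 13, 7, 10

foreach ($n in 1, 3, 6) {
    "parallel count = {0}: {1} seconds (ideal)" -f $n, (Get-IdealMakespan $durations $n)
}
```

This gives 64, 26 and 16 seconds for 1, 3 and 6 slots. The measured times are a few seconds higher in each case, which is presumably the overhead of starting background jobs and detecting that they have finished.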

This was still a rough test script, and I had not yet used it at large scale at the time. In real production use, more edge cases would surely show up.
