Bentley Communities | ProjectWise PowerShell Extensions Forum

    • State: Not Answered
    • Tags: powershell, ProjectWise, pwps_dab
    Multi-threading or Using Jobs

    Brian Flaherty, 1 month ago

    Hello All,

    I have a script that loops through multiple datasources, gathering data for each, populating a datatable, and exporting to an Excel file. Simple enough.

    My question is: has anyone tried using some type of multi-threading or jobs to process multiple datasources simultaneously? Hope this makes sense.

    I am trying to cut the amount of time it takes to run the script and gather the data. Thanks in advance.

    Cheers,

    Brian


    Top Replies

    • Robert McMillan, Wed, Oct 18 2023 8:18 PM
      You could also look at PowerShell Workflows with the -parallel & -throttlelimit switches. Here's something I did with running multiple scanrefs.exe instances in parallel. Workflow Scan-ReferencesMultithreaded…
    • Robert McMillan, Thu, Oct 19 2023 6:33 PM, in reply to Kevin van Haaren
      I'm using PowerShell 5.1 as well and I think the -parallel switch is only available within the context of a Workflow function. This wasn't specifically using PWPS_DAB cmdlets and I think with Scope issues…
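    Robert's Workflow approach would look roughly like this (a minimal sketch, assuming Windows PowerShell 5.1, where workflows are available; `Process-Datasource.ps1` and its parameter are hypothetical stand-ins for the worker script):

    ```powershell
    # Workflows exist only in Windows PowerShell (5.1 and earlier), not PowerShell 7+.
    Workflow Invoke-DatasourcesParallel {
        param([string[]]$DatasourceList)

        # -Parallel runs the loop bodies concurrently; -ThrottleLimit caps how many at once
        ForEach -Parallel -ThrottleLimit 3 ($ds in $DatasourceList) {
            # InlineScript runs ordinary PowerShell inside the workflow;
            # $using: reaches back to the workflow-scope loop variable
            InlineScript {
                & 'C:\path\to\Process-Datasource.ps1' -datasource $using:ds
            }
        }
    }

    Invoke-DatasourcesParallel -DatasourceList @('server:dsource1', 'server:dsource2')
    ```

    Robert's caveat about scope still applies: workflow activities run in separate sessions, so anything the loop body needs must be passed in explicitly.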
    • Kevin van Haaren, Wed, Oct 18 2023 2:10 PM

      I used jobs a long time ago to do something similar, but I'm not currently doing it (not using the script currently). There were a bunch of tricks I had to figure out to get it to work.

      One was to make the script that does the steps you want to do on each datasource a standalone script that accepts parameters. Then use a 2nd script to spawn the jobs with the appropriate parameters. Trying to do it with functions or code blocks made my brain hurt.

      It was also tricky getting data back. If I remember correctly I basically saved the data into an array then processed all the returned data after the jobs were done. Having multiple jobs attempt to write to the same excel file was bad (even different sheets I believe).

      Also I found it was a good idea to spawn a limited number of jobs in parallel and then watch the job queue for jobs to finish before starting new ones.

      Let me see if I can find the actual code.

       

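      Kevin's two-script pattern can be sketched like this (hypothetical names throughout; the point is that the worker is a standalone, parameterized script, so Start-Job has something self-contained to run):

      ```powershell
      # --- Get-DatasourceStats.ps1 (hypothetical worker script) ---
      param(
          [Parameter(Mandatory, Position = 0)]
          [string]$Datasource
      )
      # ... connect to $Datasource, gather rows into a datatable, etc. ...
      # Emit results to the pipeline; the parent collects them via Receive-Job.
      [pscustomobject]@{ Datasource = $Datasource; RowCount = 0 }
      ```

      ```powershell
      # --- Spawner: one job per datasource, gather everything afterwards ---
      $jobs = foreach ($ds in $dsList) {
          Start-Job -FilePath '.\Get-DatasourceStats.ps1' -ArgumentList $ds
      }
      $results = $jobs | Wait-Job | Receive-Job   # collect output after all jobs finish
      $jobs | Remove-Job
      # Only now write $results to Excel from the parent process -- a single
      # writer, which avoids the file-contention problem Kevin mentions.
      ```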
    • Kevin van Haaren, Wed, Oct 18 2023 2:11 PM, in reply to Kevin van Haaren

      Oh yeah, scoping was a whole deal, both for passing parameters in and for getting data back.

       

    • Kevin van Haaren, Wed, Oct 18 2023 3:09 PM, in reply to Kevin van Haaren

      OK, I found my script, but the way I was running it was due to our database server configuration. We have multiple database servers with multiple datasources on them, rather than one big load-balanced instance. So I would launch the jobs so there was one job per database server, and that job processed all the datasources on that server synchronously while other jobs did the same for other database servers. This meant that no more than one database statistics update was taking place per database server at a time.

      Let me massage this a bit and I think it'll be more understandable (I also have to strip a bunch of junk out; I haven't touched these since 2021).

       

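      The one-job-per-server layout Kevin describes can be sketched like this (assuming the `server:datasource` naming from his later example; the per-datasource work is a placeholder):

      ```powershell
      $dsList = @('serverA:ds1', 'serverA:ds2', 'serverB:ds1', 'serverB:ds2')

      # Group the datasource strings by their server prefix
      $byServer = $dsList | Group-Object { ($_ -split ':')[0] }

      # One job per server; each job walks its own server's datasources
      # synchronously, so no server runs more than one update at a time
      $jobs = foreach ($group in $byServer) {
          Start-Job -Name $group.Name -ScriptBlock {
              param([string[]]$Datasources)
              foreach ($ds in $Datasources) {
                  "processed $ds"   # placeholder for the real per-datasource work
              }
          } -ArgumentList (, [string[]]$group.Group)
      }
      $jobs | Wait-Job | Receive-Job
      $jobs | Remove-Job
      ```

      The leading comma in `(, [string[]]$group.Group)` keeps the array from being unrolled into separate arguments, so the whole list binds to the single `$Datasources` parameter.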
    • Brian Flaherty, Wed, Oct 18 2023 3:57 PM, in reply to Kevin van Haaren

      I figured you'd be the one to have done something with this before. I have been able to get multiple jobs to run and return data, which totally made my brain hurt. Now I am trying to figure out how to throttle the number of concurrent jobs running. Thanks.

    • Kevin van Haaren, Wed, Oct 18 2023 4:40 PM, in reply to Brian Flaherty

      Ah, this is roughly how I did it (this code has never been run, so it probably has errors): I have an array of items I want to process and a maximum number I want to run at a time. As I launch jobs, I add the job info to an array. When the size of the array reaches the maximum number of jobs, I switch to polling the current jobs until they complete or crash. As they complete, I save the results and then remove the job info from the array (which opens another slot to start a new job...)

      $dsList = @(
      	'server:dsource1'
      	'server:dsource2'
      	'server:dsource3'
      	'server:dsource4'
      	'server:dsource5'
      	'server:dsource6'
      )
      
      $maxJobs = 3
      $dsIdx = 0
      $sleepSecs = 5
      $jobList = [System.Collections.ArrayList]@()
      $results = [System.Collections.ArrayList]@()
      # keep looping until every datasource is launched AND every job is harvested
      While ($dsIdx -le $dsList.GetUpperBound(0) -or $jobList.Count -gt 0) {
      	# room for another job, and datasources left to launch?
      	if ($jobList.Count -lt $maxJobs -and $dsIdx -le $dsList.GetUpperBound(0)) {
      		$ds = $dsList[$dsIdx]
      		$jobSplat = @{
      			Name         = $ds
      			FilePath     = '\path\to\script.ps1'
      			ArgumentList = $ds   # bound positionally to the script's parameter
      		}
      		Write-Host "Starting job $($jobList.Count + 1): $ds"
      		[void]$jobList.Add((Start-Job @jobSplat))
      		$dsIdx++
      	} else {
      		# Check job list for finished/crashed jobs.
      		# Iterate a copy so removing items doesn't break the enumeration.
      		ForEach ($j in @($jobList)) {
      			$curState = (Get-Job -Id $j.Id).State
      			Switch ($curState) {
      				"Running" { break }              # break exits switch
      				"Completed" {
      					# collect the job's output, then clean up the job
      					[void]$results.Add((Receive-Job -Id $j.Id))
      					[void](Remove-Job -Id $j.Id)
      					# remove completed job from job list
      					[void]$jobList.Remove($j)
      					break
      				}
      				default {
      					$err = Receive-Job -Id $j.Id 2>&1
      					Write-Error "Error occurred in job $($j.Id): $($err)"
      					[void](Remove-Job -Id $j.Id)
      					[void]$jobList.Remove($j)
      					break
      				}
      			}
      		}
      	}
      	# pause to give jobs time to run
      	Start-Sleep -Seconds $sleepSecs
      }
      

       

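    Worth noting for newer environments: PowerShell 7's `ForEach-Object -Parallel` provides this throttling without hand-rolled job bookkeeping (a minimal sketch; the worker script path is a placeholder):

    ```powershell
    # PowerShell 7+ only: runspace-based parallelism with built-in throttling
    $results = $dsList | ForEach-Object -Parallel {
        # $_ is the current datasource; use $using: for parent-scope variables
        & '\path\to\script.ps1' -datasource $_
    } -ThrottleLimit 3
    ```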

    © 2023 Bentley Systems, Incorporated