
Version 59 (modified by Erica Kaminski, 10 years ago)

MHD SHEAR RUNS

5 Production Runs:

Tfinal = 20 Myr, nframes = 200, maxlevel = 5

||= Beta =||= Shear Angle =||= Purpose =||= Status =||
|| 10 || 0 || Compare with Jonathan's work || running on Stampede ||
|| 10 || 15 || Compare with Christina's work || complete ||
|| 10 || 30 || "" || complete ||
|| 10 || 60 || "" || complete ||
|| 1 || 60 || Compare a strong field || complete ||

Resolution Studies:

I was thinking of doing a convergence study for the Shear 60 case, to see whether convergence depends on the field strength.

Something like 4 small resolution tests out to frame 50:

||= Beta =||= Shear Angle =||= maxlevel =||= Status =||
|| 10 || 60 || 4 || gives error ||
|| 10 || 60 || 6 || 25/200 ||
|| 1 || 60 || 4 || gives error ||
|| 1 || 60 || 6 || 23/200 ||

But since the maxlevel = 4 runs are giving errors, I was thinking I could instead run 2 small convergence runs on the Beta = 10, Shear 15 case:

||= Beta =||= Shear Angle =||= maxlevel =||= Status =||
|| 10 || 15 || 4 || 23/200 ||
|| 10 || 15 || 6 || not yet started ||
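
For reference when comparing the maxlevel = 4, 5 and 6 runs, here is a minimal sketch of how the finest cell size scales with maxlevel. It assumes a refinement factor of 2 per AMR level, which this page does not state, and the function name is hypothetical. The base grid is not given here, so only sizes relative to the production runs (maxlevel = 5) are computed.

```python
def relative_cell_size(maxlevel, reference_level=5, refinement_ratio=2):
    """Finest cell size at `maxlevel`, in units of the finest cell size
    at `reference_level`. Assumes each AMR level refines by
    `refinement_ratio` (2 is an assumption, not taken from the page)."""
    return refinement_ratio ** (reference_level - maxlevel)


if __name__ == "__main__":
    for level in (4, 5, 6):
        # Values above 1 mean coarser than the production runs.
        print(f"maxlevel = {level}: relative cell size = {relative_cell_size(level)}")
```

Under that assumption, the maxlevel = 4 tests are a factor of 2 coarser than the production runs and the maxlevel = 6 tests a factor of 2 finer.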

Here is a table of run stats to help in monitoring jobs:

||= Beta =||= Shear =||= Machine, Directory =||= Frame =||= Filling Fracs =||= Walltime left (predicted / queued remaining) =||= Info allocs, Message allocs =||= No. cells finest =||= Framerate, no. of cores =||= Notes =||
|| 1 || 15 || BS, /scratch/ekamins2/MHD_Shear/Beta1/xfield/Shear15/Restart_ResJune05_2014/Less_Levels/Trying_New_Frame_Copy/Test_Same_Levels_As_Orig || restart from frame = 56 ||  ||  ||  ||  ||  || Code prints hypre errors after restarting files from bamboo using a different number of levels. Didn't get these errors when I restarted using the same number of levels. ||
|| 1 || 60 || BS, /scratch/ekamins2/MHD_Shear/Beta1/xfield/Shear60 || 67/200 || 0.087 0.541 0.615 0.243 0.286 || 9days/0days || 265.2 gb 104.5 mb, 128.0 mb || 19,440,450 || 1.2hr/frame, 4096 cores || Hit time limit ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta1/xfield/Shear60/Restart || 86/200 || 0.096 0.537 0.628 0.295 0.402 || 13days/0days || 240 gb 167 mb, 64 mb || 37,101,854 || 3hr/frame, 2048 cores || Hit time limit ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta1/xfield/Shear60/Restart/Restart || 89/200 || 0.097 0.538 0.628 0.298 0.411 || 12days/0days || 243 gb 168 mb, 64 mb || ~37,101,854 || 3hr/frame, 2048 cores || Hit time limit ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta1/xfield/Shear60/Restart/Restart/Restart || 93/200 || 0.098 0.539 0.625 0.304 0.435 || 12.5days/0days || 250.8 gb, 178.9 mb, 64.0 mb ||  ||  || Files taken off of BS (07/28) ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta1/xfield/Shear60/Restart/Restart/Restart/Restart || 102/200 || 0.101 0.543 0.631 0.314 0.459 || 17.5days/0days || 267.4 gb 184.0 mb, 64.0 mb ||  || 3hr/frame, 2048 cores ||  ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta1/xfield/Shear60/Restart/Restart/Restart/Restart/Restart || 106/200 || 0.102 0.549 0.627 0.313 0.458 || 1.3mo/0 || 263.9 gb 328.8 mb, 64.0 mb ||  || 4hr/frame, 2048 cores ||  ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta1/Shear60/ || 112/200 || 0.105 0.553 0.621 0.312 0.469 || 13.5days/0days || 271.6 gb, 193.1 mb, 64.0 mb ||  || ~4.5hr/frame, 2048 cores || Reservation. Testing what node fraction would be most efficient. Stayed with 128 nodes. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta1/Shear60/Restart/ || 122/200 || 0.106 0.559 0.631 0.321 0.467 || 16.8days/0days || 282.5 gb, 197.6 mb, 64.0 mb ||  || ~5hr/frame, 2048 cores || Reservation. Stayed with 128 nodes. Started with 48 hours left on reservation. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta1/Shear60/Restart/Restart || 130/200 || 0.109 0.558 0.640 0.328 0.455 || 16.1days/0days || 288.3 gb, 199.5 mb, 64.0 mb ||  || ~5hours/frame, 2048 cores || Previous reservation got extended. Stayed with 128 nodes. Ran for 48 hours until it went over the soft limit and BS killed the job. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta1/Shear60/Restart/Restart/Restart/ || 142/200 || 0.106 0.576 0.635 0.312 0.450 || 8.8days/0days || 381.9 gb, 138.9 mb, 128.0 mb ||  || ~4hours/frame, 4096 cores || On reservation madams15-20140807, starting on Thursday at 8AM for 5 days. ReservationName=madams15-20140807 StartTime=2014-08-07T08:00:00 EndTime=2014-08-12T08:00:00 Duration=5-00:00:00 Nodes=bg0000 NodeCnt=512 CoreCnt=8192 Features=(null) PartitionName=standard Flags= Users=ekamins2,madams15 Accounts=(null) Licenses=(null) State=INACTIVE ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta1/Shear60/Restart/Restart/Restart/Restart/ || 148/200 || 0.114 0.575 0.638 0.344 0.479 || 10days/0days || 310.2 gb, 213.3 mb, 64.0 mb ||  || ~5hours/frame, 2048 cores || On reservation. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta1/Shear60/Restart/Restart/Restart/Restart/Restart || TBA/200 || TBA || TBA || TBA ||  || TBA, 4096 cores || On reservation. ||
|| 10 || 15 || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear15 || 36/200 || 0.142 0.547 0.575 0.183 0.367 || 8days/0days || 233.6 gb 58.7 mb, 192.0 mb || 9,058,432 || 50 mins/frame, 8192 cores || Reservation ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear15/Restart || 43/200 || 0.155 0.546 0.565 0.214 0.375 || 7days/0days || 259.3 gb 62.6 mb, 192.0 mb || 11,588,046 || 1hr/frame, 8192 cores || Reservation ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear15/Restart/Restart || 57/200 || 0.186 0.573 0.533 0.266 0.388 || 7days/0days || 315.9 gb 68.6 mb, 192.0 mb || 17,705,187 || 1hr/frame, 8192 cores || Non-threaded, non-global partition Hypre, reservation ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear15/Restart/Restart/Restart_With_Threading || 69/200 || 0.228 0.582 0.548 0.356 0.450 || 11days/0days || 221.7 gb 155.2 mb, 64.0 mb || 35,179,794 || 2hr/frame, 2048 cores || Standard queue (2048 cores). ETA was originally 6/18/2014, but the job actually started on 6/23 at 3 AM. Copied the non-threaded version of astrobear into the run directory and named it threaded to match the submitted batch script, because the threaded version of the code threw errors. ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear15/Restart/Restart/Restart_With_Threading/Restart || 83/200 || 0.255 0.612 0.562 0.36 0.418 || 14 days/0days || 326 gb 122 mb, 128.0 mb || 39,856,494 || 2.5hr/frame, 4096 cores || Reservation ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear15/Restart/Restart/Restart_With_Threading/Restart/Restart || 94/200 || 0.277 0.635 0.571 0.391 0.407 || 10 days/0days || 362 gb 131 mb, 128.0 mb || 48,267,318 || 2.5hr/frame, 4096 cores || Reservation, disk quota exceeded ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear15/Restart/Restart/Restart_With_Threading/Restart/Restart/Restart || 95/200 || .279 .638 .57 .395 .409 || 13 days/0days || 365 gb 143 mb, 128.0 mb || 49,500,836 || NA || NA ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear15/Restart/Restart/Restart_With_Threading/Restart/Restart/Restart/Restart || 105/200 || .303 .647 .581 .414 .403 || 10 days/0days || 400 gb 148 mb, 128.0 mb || 57,387,965 || 3hr/frame, 4096 cores || Disk quota exceeded, files are ~9 GB each at this point ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear15/Restart/Restart/Restart_With_Threading/Restart/Restart/Restart/Restart/Restart/Restart || 109/200 || .325 .663 .596 .465 .436 || 17.5 days/0days || 344 gb 236 mb, 64 mb || 78,627,595 || 5hr/frame, 2048 cores || Still running, size of files = 13 GB ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear15/Restart/Restart/Restart_With_Threading/Restart/Restart/Restart/Restart/Restart/Restart/Restart/ || 119/200 || 0.344 0.675 0.610 0.483 0.438 || 18.3days/0days || 372.7 gb, 250.9 mb, 64.0 mb ||  || 5hr/frame, 2048 cores || Files taken off of BS (07/28) ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear15/Restart/Restart/Restart_With_Threading/Restart/Restart/Restart/Restart/Restart/Restart/Restart/Restart || 120/200 || 0.344 0.680 0.614 0.485 0.436 || 24.4days/0days || 374.3 gb, 259.0 mb, 64.0 mb ||  || 10hr/frame, 2048 cores || Catching chombos for Marissa's runs. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear15 || 133/200 || 0.367 0.670 0.565 0.406 0.388 || 6days/0days || 592.6 gb, 116.2 mb, 192.0 mb ||  || ~3-4hr/frame, 2048 cores; ~2hr/frame, 8192 cores || Reservation. Testing what node fraction would be most efficient. Stayed with 128 nodes for 07/24, then bumped up to 256 on 07/25, then finally 512 for another 24 hours. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear15/Restart || 139/200 || 0.386 0.698 0.616 0.496 0.468 || 22.1days/0days || 422.4 gb, 296.5 mb, 64.0 mb ||  || 7hr/frame, 2048 cores || Restarted with 128 nodes on 07/26. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear15/Restart/Restart/ || 144/200 || 0.398 0.695 0.617 0.499 0.472 || 20.4days/0days || 432.4 gb, 290.9 mb, 66.1 mb ||  || ~9hr/frame, 2048 cores || Previous reservation got extended until 08/04 at 08:00:00. Stayed with 128 nodes. Job ran for 48 hours until memory went over soft limit. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear15/Restart/Restart/Restart/ || 149/200 || 0.407 0.699 0.620 0.508 0.455 || 13.3days/0days || 441.2 gb, 298.9 mb, 66.9 mb ||  || ~8hr/frame, 2048 cores || On reservation. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear15/Restart/Restart/Restart/Restart/ || 160/200 || 0.414 0.686 0.595 0.469 0.433 || 9.2days/0days || 535.8 gb, 186.1 mb, 128.0 mb ||  || ~3hr/frame, 4096 cores || On reservation. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear15/Restart/Restart/Restart/Restart/Restart/ || TBA/200 || TBA || TBA || TBA ||  || TBA, 2048 cores || On reservation. ||
|| 10 || 30 || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear30 || 7/200 || 0.058 0.467 0.607 0.000 0.000 || 1 day/0 || 29.1 gb 81.7 mb, 17.3 mb || 2,482,538 || 7mins/frame, 512 cores || Running a bigger box (200,75,75 pc) with same eff. resolution to prevent backflow into colliding flow object. Debug queue, 1 hour, 512 cores ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear30/Testing_Smaller_Box || 7/200 || 0.163 0.536 0.607 0.000 0.000 || 1 day/0 || 28.5 gb 83.0 mb, 17.5 mb ||  ||  || Running original box size (62.5,75,75 pc) with same eff. resolution to see how framerate changes. Debug queue, 1 hour, 512 cores. Same framerate, so making the box longer in x seems to not cost much. ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear30/Restart || 27/200 || 0.053 0.515 0.625 0.090 0.204 || 5.6 days/0 || 136.5 gb 69.9 mb, 101.8 mb || 3,026,761 || 30mins/frame, 4096 cores || Reservation, ekamins2-20140623 ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear30/Restart/Restart || 64/200 || 0.077 0.494 0.567 0.260 0.362 || 14 days/0 || 249 gb 99 mb, 128.0 mb || 19,616,658 || 2hr/frame, 4096 cores || Reservation, ekamins2-20140623 ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear30/Restart/Restart/Restart || 78/200 || .092 .529 .546 .307 .354 || 16days/0days || 293 gb, 114 mb, 128 mb || 27,907,450 || 3hr/frame, 4096 cores || Reservation, ekamins2-20140630 ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear30/Restart/Restart/Restart/Restart || 95/200 || .107 .552 .566 .361 .354 || 11days/0days || 360 gb, 130 mb, 128 mb || 41,284,985 || 3hr/frame, 4096 cores || Reservation, ekamins2-20140630 ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear30/Restart/Restart/Restart/Restart/Restart || 110/200 || .115 .571 .587 .391 .367 || 10.4days/0days || 404 gb, 145 mb, 128 mb || 53,451,174 || 3hr/frame, 4096 cores || Reservation, ekamins2-20140630 ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear30/Restart/Restart/Restart/Restart/Restart || 110/200 || .115 .571 .587 .391 .367 || 10.4days/0days || 404 gb, 145 mb, 128 mb || 54,856,949 || 3hr/frame, 4096 cores || Reservation, ekamins2-20140630 ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear30/Restart/Restart/Restart/Restart/Restart/Restart || 114/200 || .117 .576 .593 .398 .371 || 10.9days/0days || 415 gb, 147 mb, 128 mb || 53,451,174 || 3hr/frame, 4096 cores || Reservation, ekamins2-20140630, 9.6 GB/frame ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear30/Restart/Restart/Restart/Restart/Restart/Restart/Restart || 122/200 || 0.130 0.587 0.596 0.456 0.412 || 18.1days/0 || 363.2 gb, 246.8 mb, 64.0 mb ||  || ~5-6hr/frame, 2048 cores ||  ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear30/Restart/Restart/Restart/Restart/Restart/Restart/Restart/Restart/ || 123/200 || 0.132 0.583 0.595 0.461 0.424 || 21.1days/0 || 367.8 gb, 245.9 mb, 64.0 mb ||  || ~8hr/frame, 2048 cores || Catching chombos for Marissa's runs. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear30 || 128/200 || 0.136 0.587 0.601 0.465 0.418 || 1.2mo/0 || 377.0 gb, 252.4 mb, 64.0 mb ||  || 9hr/frame, 2048 cores || Started Jul 24 08:46, cancelled to test other runs, started again at Jul 26 16:41. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear30/Restart/ || 143/200 || 0.131 0.620 0.570 0.392 0.367 || 7.8days/0days || 599.5 gb, 122.6 mb, 256.0 mb ||  || ~2-3hours/frame, 8192 cores || Started immediately when we received word about the reservation extension until 08/04 at 08:00:00. Ran on 512 nodes, half of BS, for 48 hours. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear30/Restart/Restart/ || 153/200 || 0.141 0.622 0.583 0.442 0.400 || 9days/0days || 504.3 gb, 175.0 mb, 128.0 mb ||  || ~4-5hours/frame, 4096 cores || Still on the same reservation. Ran alongside Beta1Shear60 and Beta10Shear15 (both at 128 nodes), while this ran at 256 nodes. Stopped eventually when we could no longer write files due to the soft limit. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear30/Restart/Restart/Restart || 158/200 || 0.153 0.625 0.595 0.494 0.437 || 18.2days/0days || 437.6 gb, 302.6 mb, 65.8 mb ||  || ~8.5hours/frame, 2048 cores || On reservation. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear30/Restart/Restart/Restart/Restart || 162/200 || 0.156 0.619 0.601 0.495 0.446 || 13.6days/0days || 445.6 gb, 304.2 mb, 68.0 mb ||  || ~8.5hours/frame, 2048 cores || On reservation. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear30/Restart/Restart/Restart/Restart/Restart || TBA/200 || TBA || TBA || TBA ||  || TBA, 2048 cores || On reservation. ||
|| 10 || 60 || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear60 || 5/200 || 0.078 0.522 0.680 0.000 0.000 || 1.6days/0 || 37.6 gb 104.3 mb, 22.9 mb || 4,180,579 || 10mins/frame, 512 cores || Debug queue, 512 cores, 1 hour, 200x75^2 pc ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear60/Restart || 91/200 || 0.124 0.484 0.586 0.158 0.205 || 7.2 days/0 day || 254.7 gb 99.0 mb || 11,008,243 || 1hr/frame, 4096 cores || Died with memory error. Reservation ekamins2-20140623 ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear60/Restart/Restart || 110/200 || 0.146 0.525 0.570 0.246 0.240 || 8.4 days/0 day || 241.2 gb 170.2 mb, 64.0 mb || 24,927,329 || 2.5hr/frame, 2048 cores || Reservation ekamins2-20140623 ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear60/Restart/Restart/Restart || 116/200 || 0.151 0.527 0.559 0.260 0.250 || 10.8days/0 || 248.6 gb, 171.3 mb, 64.0 mb ||  || 2048 cores || Chombos/sinks moved off of BS. Post-processing is left (07/28). ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear60/Restart/Restart/Restart/Restart/ || 130/200 || 0.162 0.535 0.569 0.268 0.259 || 8.7days/0 || 269.3 gb, 189.6 mb, 64.0 mb ||  || 3hr/frame, 2048 cores ||  ||
||  ||  || BS, /scratch/ekamins2/MHD_Shear/Beta10/Shear60/Restart/Restart/Restart/Restart/Restart/ || 133/200 || 0.164 0.541 0.564 0.282 0.260 || 7.9days/0 || 274.9 gb, 189.5 mb, 64.0 mb ||  || 3hr/frame, 2048 cores ||  ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear60 || 140/200 || 0.174 0.547 0.550 0.303 0.251 || 9.7days/0 || 288.4 gb, 196.4 mb, 64.0 mb ||  || 4hr/frame, 2048 cores || 07/24-07/25 Reservation at 128 nodes. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear60/Restart/ || 151/200 || 0.180 0.547 0.544 0.318 0.269 || 6.7days/0 || 305.3 gb, 211.8 mb, 64.0 mb ||  || ~4hr/frame, 2048 cores ||  ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear60/Restart/Restart/ || 153/200 || 0.184 0.543 0.545 0.318 0.273 || 8.6days/0days || 309.2 gb, 209.7 mb, 64.0 mb ||  || ~4hours/frame, 2048 cores || Ran on 128 nodes in the Standard queue. As our reservation took up half of the machine, this run was pending for a while. Error when soft limit hit: 2014-08-02 03:42:44.215 (ERROR) [0x40001069280] ibm.runjob.client.Output: could not write: Disk quota exceeded ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear60/Restart/Restart/Restart || 159/200 || 0.188 0.540 0.553 0.326 0.280 || 7.0days/0days || 318.8 gb, 217.9 mb, 64.0 mb ||  || 4hours/frame, 2048 cores || Running on 128 nodes in the Standard queue. Error found in the astrobear.log file: processor 1339 requesting restart due to nan in flux wl= 0.359789776662163E+01 0.674778789358754E+01 0.657065342093277E+02 0.258750140448991E+00 0.113133666748740E+05 0.731432876571028E+01 0.517505399598208E+02 -0.503691004114713E+01 0.174425741998655E+00 0.353838024064413E+01 ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear60/Restart/Restart/Restart/Restart/ || 163/200 || 0.190 0.547 0.550 0.327 0.282 || 7.7days/0days || 324.9 gb, 218.6 mb, 64.0 mb ||  || 6hours/frame, 2048 cores || Running on 128 nodes in the standard queue. No more errors; clean run. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear60/Restart/Restart/Restart/Restart/Restart || 167/200 || 0.192 0.549 0.551 0.333 0.284 || 15.8days/0days || 323.5 gb, 222.3 mb, 64.0 mb ||  || 6hours/frame, 2048 cores || Running on 128 nodes in the standard queue. ||
||  ||  || BS, /scratch/madams15/CollidingFlows/Beta10/Shear60/Restart/Restart/Restart/Restart/Restart/Restart || TBA/200 || TBA || TBA || TBA ||  || TBA, 2048 cores || Running on 128 nodes in the standard queue. Currently pending. ||
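
As a rough cross-check on the table when monitoring jobs, the sketch below converts node counts to core counts and estimates the days left from the current frame and framerate. The 16 cores per node figure is taken from the reservation record in the table (NodeCnt=512, CoreCnt=8192). The function names are hypothetical, and the estimate assumes a constant framerate, which the table shows does not hold as the runs progress, so it will not match the walltime predicted by the code.

```python
# Cores per node on BS, inferred from the reservation record in the table:
# NodeCnt=512, CoreCnt=8192.
CORES_PER_NODE = 8192 // 512


def cores_for_nodes(nodes):
    """Convert a node count to a core count."""
    return nodes * CORES_PER_NODE


def days_remaining(frame, nframes, hours_per_frame):
    """Wall-clock days left, assuming the framerate stays constant
    (a simplification; the framerate in the table changes over time)."""
    return (nframes - frame) * hours_per_frame / 24.0


if __name__ == "__main__":
    for nodes in (128, 256, 512):
        print(f"{nodes} nodes = {cores_for_nodes(nodes)} cores")
    # Beta 1, Shear 60, first restart: frame 86/200 at 3 hr/frame.
    print(days_remaining(86, 200, 3.0))  # 14.25 (the table lists 13 days)
```

This gives 2048, 4096 and 8192 cores for 128, 256 and 512 nodes, matching the core counts listed in the table.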
