
Update with 1024 cores

I ran the Beta 10 No Shear case on Stampede with 1024 cores. Here is the result (see Table below):

At frame 246, we have 154 frames left. With an average of 87.67 minutes per frame, our rate is about 16.4 frames per day, so dividing 154 by this rate gives 9.4 days (225.6 hrs) to run this simulation out. Thus, 225.6 * 1024 = 231,014.4 cpu hrs. Multiplying this by 4, as we have 4 runs, yields approximately 924,057.6 cpu hrs total. This is not much different than the total from last week. It does not seem economical to run these on 1024 cores; in my opinion we might as well run these on 2048 cores, as they'll be faster with little to no change in cpu hrs.
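The arithmetic in this estimate can be reproduced with a short script. This is only a sketch: the function name `cpu_hours_remaining` is hypothetical, and the inputs are the values quoted in this post (87.67 minutes per frame, frame 246 of 400, 1024 cores).

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes in a day


def cpu_hours_remaining(frames_left, minutes_per_frame, cores):
    """Estimate frames/day, wall-clock days, and cpu hours to finish a run.

    Hypothetical helper for illustration; not part of the simulation code.
    """
    frames_per_day = MINUTES_PER_DAY / minutes_per_frame
    days = frames_left / frames_per_day
    cpu_hours = days * 24 * cores
    return frames_per_day, days, cpu_hours


# Values from this post: currently at frame 246, final frame 400.
frames_per_day, days, cpu_hours = cpu_hours_remaining(400 - 246, 87.67, 1024)
print(f"{frames_per_day:.1f} frames/day, {days:.1f} days, "
      f"{cpu_hours:,.0f} cpu hrs per run, {4 * cpu_hours:,.0f} for 4 runs")
```

Because the script does not round the run time to 9.4 days before multiplying, its per-run figure comes out a fraction of a percent lower than the 231,014.4 cpu hrs quoted here.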

Perhaps we should choose just a few cases to run on 2048 cores?

If on 2048 cores we estimate 34.85 frames a day (the average of the rates from the last blog post) with approximately 164 frames left (the average from the last blog post), that implies approximately 4.7 days to run a simulation, or 113 hours. This is approximately 231,304 cpu hrs per run. With 3 runs, that is 693,911 cpu hrs. With 2 runs, that is 462,607 cpu hrs.
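To compare how the total scales with the number of cases, the same estimate can be tabulated for 2, 3, and 4 runs. This is a minimal sketch using the averaged figures quoted in this post; the constant names are my own.

```python
CORES = 2048
FRAMES_PER_DAY = 34.85  # average rate quoted in this post
FRAMES_LEFT = 164       # average number of frames remaining quoted in this post

# Wall-clock hours to finish one run, then cpu hours for that run.
hours_per_run = FRAMES_LEFT / FRAMES_PER_DAY * 24
cpu_hours_per_run = hours_per_run * CORES

# Total cost for different numbers of runs.
totals = {n: n * cpu_hours_per_run for n in (2, 3, 4)}
for n, total in totals.items():
    print(f"{n} runs: {total:,.0f} cpu hrs")
```

Dropping from 4 runs to 2 halves the cost, since the estimate is linear in the number of runs.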

Perhaps we could split the runs between machines? However, we aimed to use Stampede because it is so fast compared to the likes of BlueStreak.

| Run (Current Frame) | Hours Sampled (start - end) | Time (mins) | Avg. time (mins) |
|---|---|---|---|
| b10s0 (246) | 23:30 - 00:50 | 80 | 87.67 |
| | 10:27 - 11:58 | 91 | |
| | 19:44 - 21:16 | 92 | |

Table. The Beta 10 Shear 0 run with its current frame, the intervals over which the time per frame was sampled, and the average used in the brief calculations above.
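The per-frame times in the table are differences between clock times, one of which crosses midnight (23:30 - 00:50). A small sketch of how these could be computed and averaged follows; the helper `elapsed_minutes` is hypothetical, and it assumes each sampled interval is shorter than 24 hours.

```python
from datetime import datetime, timedelta


def elapsed_minutes(start, end):
    """Minutes between two HH:MM clock times, allowing a wrap past midnight."""
    t0 = datetime.strptime(start, "%H:%M")
    t1 = datetime.strptime(end, "%H:%M")
    if t1 < t0:
        # The end time is on the following day.
        t1 += timedelta(days=1)
    return (t1 - t0).total_seconds() / 60


# Sampled intervals for b10s0 from the table.
samples = [("23:30", "00:50"), ("10:27", "11:58"), ("19:44", "21:16")]
times = [elapsed_minutes(start, end) for start, end in samples]
average = sum(times) / len(times)
print(times, round(average, 2))
```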

cpu hrs for CF runs on Stampede

I am running the CollidingFlows problem out from frame 200 to 400 to double the simulated time and see if we can observe any more sink formation. Given that this run is really computationally intensive, I've done a quick calculation of cpu hrs based on some runs I currently have on Stampede. All runs are in the normal queue for 24 hrs on 2048 cores. The table below provides the frame number at which I collected this data for each run. The average time for our code to spit out a frame is (subscripts correspond to the run): t_b10s0 ≈ 44.3 min, t_b10s15 ≈ 46 min, t_b10s30 ≈ 47 min, and t_b10s60 ≈ 32 min.

Given that there are 1,440 minutes in a day, we'd spit out the following frames per day: about 32.5 for b10s0, 31.3 for b10s15, 30.6 for b10s30, and 45 for b10s60.

Considering that the differences between the current frame and the last frame (400) for beta10 shear 0, 15, 30, and 60 are 179, 182, 159, and 136 respectively, we're looking at running these out for approximately 3 to 6 days each on 2048 cores. Specifically, b10s0: 5.5 days, b10s15: 5.8 days, b10s30: 5.2 days, and b10s60: 3 days. Multiplying the total number of days by 24 hours per day and by 2048 cores puts us at a total of 957,973 cpu hrs. THAT IS INSANE.
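The per-run estimate can be sketched in a few lines. The current frames and average minutes per frame come from the table in this post; the variable names are my own, and the script skips the intermediate rounding, so its total (near 960,000 cpu hrs) differs from the figure quoted here by a fraction of a percent.

```python
CORES = 2048
LAST_FRAME = 400
MINUTES_PER_DAY = 1440

# run name: (current frame, average minutes per frame), from the table.
runs = {
    "b10s0": (221, 44.3),
    "b10s15": (218, 46.0),
    "b10s30": (241, 47.0),
    "b10s60": (264, 32.0),
}

# Wall-clock days left for each run: frames left times days per frame.
days = {
    name: (LAST_FRAME - frame) * minutes / MINUTES_PER_DAY
    for name, (frame, minutes) in runs.items()
}

# Total cost: days summed over runs, times 24 hours, times the core count.
total_cpu_hours = sum(days.values()) * 24 * CORES

for name, d in days.items():
    print(f"{name}: {d:.1f} days")
print(f"total: {total_cpu_hours:,.0f} cpu hrs")
```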

After a quick discussion with Erica and Baowei, I've come up with the following short-term plan: once these jobs stop later today, I'll submit 1 job to the normal queue on 1,000 cores. For this run I'll make the same calculation and see if it is more economical when multiplied by 4. Baowei has also suggested throwing runs on Gordon, another XSEDE machine, hosted at the San Diego Supercomputer Center. We have a lot of SUs there, so he is currently setting me up. We currently only have 1,551,296 SUs available on Stampede, so running our jobs for this problem there could be quite precarious.
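To put the estimate next to the allocation, here is a quick check of how much of the Stampede balance these runs would consume. This sketch assumes that 1 SU corresponds to 1 cpu hr (core hour), which is my assumption and not stated in this post.

```python
AVAILABLE_SUS = 1_551_296  # SUs currently available on Stampede (from this post)
ESTIMATED_COST = 957_973   # estimated cpu hrs for the four runs (from this post)

# Assumption: 1 SU is charged per cpu hr.
fraction_used = ESTIMATED_COST / AVAILABLE_SUS
remaining = AVAILABLE_SUS - ESTIMATED_COST

print(f"{fraction_used:.0%} of the allocation, {remaining:,} SUs left over")
```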

| Run (Current Frame) | Hours Sampled (start - end) | Time (mins) | Avg. time (mins) |
|---|---|---|---|
| b10s0 (221) | 16:07 - 16:55 | 48 | 44.3 |
| | 02:58 - 03:38 | 40 | |
| | 07:13 - 07:58 | 45 | |
| b10s15 (218) | 18:03 - 18:48 | 45 | 46 |
| | 02:19 - 03:03 | 44 | |
| | 11:05 - 11:54 | 49 | |
| b10s30 (241) | 17:57 - 18:40 | 43 | 47 |
| | 00:26 - 01:23 | 57 | |
| | 07:03 - 07:44 | 41 | |
| b10s60 (264) | 17:40 - 18:07 | 27 | 32 |
| | 00:04 - 00:38 | 34 | |
| | 07:43 - 08:18 | 35 | |

Table. Each run with its current frame, the intervals over which the time per frame was sampled, and the average used in the brief calculations above.