wiki:ProgrammingTip20081029

Version 6 (modified by trac, 12 years ago) ( diff )

Setting Up A Clump Perturbation Routine


I have been working on a new routine for randomly placing clumps in a rectangular region (as of 10/29/2008 I only have it working in 2D). Rather than randomly picking points and attempting to place clumps, I arrange the clumps in a rectangular grid within the clump region and then perturb them by random amounts in the x- and y-directions. The y-boundaries of the clump region are periodic, but the x-boundaries are not. In all cases, a center-to-center minimum clump separation is enforced; new perturbation lengths are chosen should a clump or any of its periodic variants fail to meet the minimum-separation constraint.


While the minimum clump separation creates an upper bound on the amount of clump interaction, the actual amount of interaction is driven by the average separation. Thus, we want to keep the average separation as close to the minimum separation as possible, so that we can use MinSeparation to control the amount of clump interaction we see. To do this, we try to fit as many clumps into the clump region as we can, while still enforcing the minimum-separation constraint.


Algorithm

  1. Determine the size of one side of an individual clump's placement boxes (BoxSide == sqrt(ClumpPlacementRegion / NumberOfClumps)).
  1. Determine the number of clump placement boxes in each column and row by dividing the x and y dimensions of the clump region by BoxSide and rounding to the nearest integer.
  1. For each (0 <= i < #rows, 0 <= j < #columns):

3i. Try this process up to MaxRep times, where MaxRep is a number of repetitions read in from clumps.data:

a) Place a clump in the center of box (i,j).

b) Obtain random values for x and y. These values are between ranMin and 1, where ranMin is a minimum value read in from the clumps.data file.

c) Use the random x value from (b) to displace x according to the formula x = x - half (BoxSide - half MinSeparation) + random_x * (BoxSize - half MinSeparation). This formula allows us to move the clump freely within the box while leaving a margin of half the clump separation on each side.

d) Repeat step (c), using y and random_y instead.

e) Check to see if any clumps have x-coordinates that place all or part of the clump outside of the overall clump placement region's x}}-boundaries. If so, then cycle back to {{{3i.

f) Check to see if any clumps have y-coordinates that place all or part of the clump outside the y-bounds of the problem. If so, then create periodic mirrors of those clumps using the formula yperiodic = y +/- (yUpperDomain - yLowerDomain).

g) Check each previously-placed clump AND its periodic mirrors against the current clump and its periodic mirrors. If the MinSeparation is violated, then cycle back to 3i.

3ii. If the current clump cannot be placed after MaxRep tries, then print an error message and kill the program.

3iii. If the number of clumps placed equals the number of clumps specified in clumps.data, then exit the loop.


Comments


As yet, this is a very rudimentary implementation of a perturbation method. Packing problems are still an ongoing field of research, but there is nevertheless a lot of room for refinement in this method. If the clumps are placed in a rectangular grid, then the maximum number of clumps MaxClumps that can be placed in a region of size A should be A / MinSeparation^2. When we randomize the clump placement while still enforcing minimum separation, we reduce the number of clumps that can be placed. In practice, I find that we can get between 60-80 % of MaxClumps, depending on the minimum separation. However, the fewer clumps we place, the more different the average separation tends to be from the minimum separation. Thus, careful choice of nClumps is critical.

Another critical parameter is MinRand. This parameter is read in from clumps.data and controls the minimum possible perturbation; a higher MinRand increases the randomness of the clump distribution, but it also reduces the number of clumps that can be placed, as the greater perturbations cause clumps to impinge on eacho other. Conversely, lower MinRand values make placement easier, at the cost of making the distribution more regular. Furthermore, once MinRand drops below 1 x 10^-10, I find that it ceases to have any effect on the average separation. In practice, I generally find MinRand values between 1 x 10^-4 and 1 x 10^-8 to be the most effective, with a preference for higher numbers whenever possible.


Places for Improvement


  • nClumps: Right now, the maximum number of clumps that get placed is i * j, where i and j. This is not necessarily the same number as nClumps. I'd be interested in a way to make nClumps be the actual number of clumps the algorithm attempts to place, but the naive implementation turns into a greatest common divisor problem, which can be computationally expensive and probably not worth the hassle.
  • reperturbation: This might cut down on the number of unresolvable configurations. If a clump is outside the x-boundaries of the clump region, apply a smaller perturbation to move it back in, rather than just coming up with a new perturbation from scratch. This might also help cure successful distribution patterns' tendency to look a little sparse around the edges.
Note: See TracWiki for help on using the wiki.