| author | Sam Anthony <sam@samanthony.xyz> | 2024-11-04 10:15:35 -0500 |
|---|---|---|
| committer | Sam Anthony <sam@samanthony.xyz> | 2024-11-04 10:15:35 -0500 |
| commit | 23f2d6f002cf48c8f17631f32ca5e46a319e03ce (patch) | |
| tree | 13e5df5fe8d5a3a8f36e8280d556acce5306a389 | |
| parent | b66fe3e88400b20df0280d10eef88b5467fb25f4 (diff) | |
report
| -rw-r--r-- | report.txt | 63 |
1 files changed, 63 insertions, 0 deletions
diff --git a/report.txt b/report.txt
new file mode 100644
index 0000000..1eab1a1
--- /dev/null
+++ b/report.txt
@@ -0,0 +1,63 @@
+COMP 426 Assignment 3
+Sam Anthony 40271987
+
+
+= Manycore implementation of 2D bouncing balls simulation on the GPU =
+
+
+# Simulation
+
+This version uses OpenCL to run the simulation on the GPU. The old C
+functions were transcribed into three OpenCL kernels: move(), which
+updates the positions of the balls; collideWalls(), which reacts to
+collisions between balls and the bounds of the screen; and collideBalls(),
+which reacts to collisions between pairs of balls.
+
+The positions, velocities, and radii of the balls are stored in OpenCL
+buffers in GPU memory. Vectors are now real float2 vectors as opposed
+to structs in prior implementations.
+
+The move() and collideWalls() kernels run with the global work size set
+to the number of balls. Each thread works on one of the balls.
+
+The collideBalls() kernel relies on the partitioning scheme from the
+TBB implementation. The host partitions the set of collisions between
+pairs of balls so that dependencies are separated in different cells of
+the partition. All collisions within a cell may run in parallel without
+synchronization. The partition is represented as a 2D array of pairs of
+indices of balls. Each row of the array represents a cell, and each pair
+of vertices within a row represents a collision between those two balls.
+
+The host creates an OpenCL buffer for each cell of the partition and
+copies the arrays into GPU memory. To run the collideBalls() kernel,
+the host iterates over the partition and sets the kernel argument to the
+appropriate buffer containing the cell. The global work size is set to
+the size of the cell. Each thread works on one collision between a pair
+of balls.
+
+This version of the program uses a different collision formula based on
+the impulse (J) between the two colliding balls. The formula is based
+on this page:
+
+https://introcs.cs.princeton.edu/java/assignments/collisions.html
+
+
+# Graphics
+
+OpenGL is used to draw the balls on screen. The program leverages
+interoperability between OpenCL and OpenGL to minimize data transfer
+between host and GPU.
+
+There are two OpenGL vertex buffer objects that reside on the GPU:
+one containing vertices, and one containing colors. There is also an
+additional OpenCL buffer instantiated with clCreateFromGLBuffer() which
+points to the same data as the GL vertex VBO.
+
+The genVertices() OpenCL kernel sets the vertex buffer according to the
+positions of the balls. The local work size is the number of vertices
+per ball (24 currently.) Each thread sets one vertex. There is one
+work group per ball. The group id is used to index the position array,
+and the global id is used to index the vertex array.
+
+After the vertices are set, the program uses glDrawArrays() with
+GL_TRIANGLE_FAN to draw the balls on-screen.
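
The indexing that the report describes for genVertices() (group id selects
the ball, global id selects the vertex) can be sketched in plain C by
emulating the work-items with two nested loops. This is only an
illustration, not the kernel from the commit: the names Vec2, gen_vertex,
gen_vertices and unit_circle are hypothetical, and the sketch assumes that
every vertex lies on the rim of the ball and that the per-vertex offsets
come from a precomputed unit-circle table. The report does not say how the
kernel actually lays out the vertices.

```c
#include <stddef.h>

/* Stand-in for OpenCL's float2 so the sketch compiles as plain C. */
typedef struct { float x, y; } Vec2;

/* Vertices per ball; the report gives 24 as the current value. */
enum { VERTS_PER_BALL = 24 };

/* Emulates one work-item. group_id picks the ball, local_id picks the
 * vertex within that ball. With a zero global offset, OpenCL's global id
 * equals group_id * local_size + local_id. */
static void gen_vertex(size_t group_id, size_t local_id,
                       const Vec2 *pos, const float *radius,
                       const Vec2 *unit_circle, /* assumed lookup table */
                       Vec2 *verts)
{
    size_t global_id = group_id * VERTS_PER_BALL + local_id;

    /* The position array is indexed by group id, the vertex array by
     * global id, as described in the report. */
    verts[global_id].x = pos[group_id].x + radius[group_id] * unit_circle[local_id].x;
    verts[global_id].y = pos[group_id].y + radius[group_id] * unit_circle[local_id].y;
}

/* Host-side stand-in for the kernel launch: one work group per ball,
 * VERTS_PER_BALL work-items per group. */
static void gen_vertices(size_t nballs,
                         const Vec2 *pos, const float *radius,
                         const Vec2 *unit_circle, Vec2 *verts)
{
    for (size_t group = 0; group < nballs; group++)
        for (size_t local = 0; local < VERTS_PER_BALL; local++)
            gen_vertex(group, local, pos, radius, unit_circle, verts);
}
```

With this layout the vertices of ball i occupy the contiguous range
starting at i * VERTS_PER_BALL, so a draw call such as
glDrawArrays(GL_TRIANGLE_FAN, i * VERTS_PER_BALL, VERTS_PER_BALL) would
cover exactly one ball. Whether the program issues one such call per ball
is not stated in the report.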
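
The impulse-based collision response mentioned in the Simulation section
can be sketched in plain C as follows. The arithmetic follows the formula
on the Princeton page linked in the report; everything else is an
assumption made for illustration. In particular, the Ball struct, the
collide_pair name and the mass field are hypothetical: the report stores
positions, velocities and radii in separate buffers and does not mention
mass. The sketch also assumes the two balls are exactly in contact, so the
distance between centres equals the sum of the radii.

```c
#include <assert.h>

/* Stand-in for OpenCL's float2 so the sketch compiles as plain C. */
typedef struct { float x, y; } Vec2;

/* Hypothetical per-ball record; mass is an assumption (see above). */
typedef struct {
    Vec2 pos, vel;
    float radius, mass;
} Ball;

/* Apply the impulse J along the line of centres to two balls in contact. */
static void collide_pair(Ball *a, Ball *b)
{
    /* Relative position and relative velocity of b with respect to a. */
    float dx  = b->pos.x - a->pos.x;
    float dy  = b->pos.y - a->pos.y;
    float dvx = b->vel.x - a->vel.x;
    float dvy = b->vel.y - a->vel.y;
    float dvdr = dx * dvx + dy * dvy;

    /* Distance between centres at the moment of contact. */
    float sigma = a->radius + b->radius;
    assert(sigma > 0.0f);

    /* Magnitude of the impulse, then its x and y components. */
    float j  = 2.0f * a->mass * b->mass * dvdr / (sigma * (a->mass + b->mass));
    float jx = j * dx / sigma;
    float jy = j * dy / sigma;

    /* Equal and opposite change in momentum for the two balls. */
    a->vel.x += jx / a->mass;
    a->vel.y += jy / a->mass;
    b->vel.x -= jx / b->mass;
    b->vel.y -= jy / b->mass;
}
```

As a sanity check, two balls of equal mass in a head-on collision exchange
velocities: with a at (0, 0) moving at (1, 0), b at (2, 0) moving at
(-1, 0), and both radii and masses equal to 1, the call leaves a moving at
(-1, 0) and b moving at (1, 0).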