report

author: Sam Anthony <sam@samanthony.xyz> 2024-11-04 10:15:35 -0500
committer: Sam Anthony <sam@samanthony.xyz> 2024-11-04 10:15:35 -0500
commit: 23f2d6f002cf48c8f17631f32ca5e46a319e03ce (patch)
tree: 13e5df5fe8d5a3a8f36e8280d556acce5306a389
parent: b66fe3e88400b20df0280d10eef88b5467fb25f4 (diff)
download: balls-23f2d6f002cf48c8f17631f32ca5e46a319e03ce.zip
1 files changed, 63 insertions, 0 deletions
diff --git a/report.txt b/report.txt
new file mode 100644
index 0000000..1eab1a1
--- /dev/null
+++ b/report.txt
@@ -0,0 +1,63 @@
+COMP 426 Assignment 3
+Sam Anthony 40271987
+
+
+= Manycore implementation of 2D bouncing balls simulation on the GPU =
+
+
+# Simulation
+
+This version uses OpenCL to run the simulation on the GPU.  The old C
+functions were transcribed into three OpenCL kernels: move(), which
+updates the positions of the balls; collideWalls(), which reacts to
+collisions between balls and the bounds of the screen; and collideBalls(),
+which reacts to collisions between pairs of balls.
+
+The positions, velocities, and radii of the balls are stored in OpenCL
+buffers in GPU memory.  Vectors are now native float2 values rather than
+the structs used in prior implementations.
+
+The move() and collideWalls() kernels run with the global work size set
+to the number of balls.  Each thread works on one of the balls.
+
+The collideBalls() kernel relies on the partitioning scheme from the
+TBB implementation.  The host partitions the set of collisions between
+pairs of balls so that collisions sharing a ball land in different cells.
+All collisions within a cell may therefore run in parallel without
+synchronization.  The partition is represented as a 2D array of pairs of
+ball indices.  Each row of the array represents a cell, and each pair of
+indices within a row represents a collision between those two balls.
+
+The host creates an OpenCL buffer for each cell of the partition and
+copies the arrays into GPU memory.  To run the collideBalls() kernel,
+the host iterates over the partition and sets the kernel argument to the
+appropriate buffer containing the cell.  The global work size is set to
+the size of the cell.  Each thread works on one collision between a pair
+of balls.
+
+This version of the program uses a different collision formula, which
+computes the impulse (J) between the two colliding balls.  The formula
+is adapted from this page:
+
+https://introcs.cs.princeton.edu/java/assignments/collisions.html
+
+
+# Graphics
+
+OpenGL is used to draw the balls on screen.  The program leverages
+interoperability between OpenCL and OpenGL to minimize data transfer
+between host and GPU.
+
+There are two OpenGL vertex buffer objects that reside on the GPU:
+one containing vertices, and one containing colors.  There is also an
+OpenCL buffer, created with clCreateFromGLBuffer(), which points to the
+same data as the GL vertex VBO.
+
+The genVertices() OpenCL kernel sets the vertex buffer according to the
+positions of the balls.  The local work size is the number of vertices
+per ball (currently 24).  Each thread sets one vertex.  There is one
+work group per ball.  The group id is used to index the position array,
+and the global id is used to index the vertex array.
+
+After the vertices are set, the program uses glDrawArrays() with
+GL_TRIANGLE_FAN to draw the balls on-screen.
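
The report says the host partitions ball-pair collisions into cells whose members can run in parallel, but it does not say how the partition is built. The sketch below is one possible construction, not necessarily the author's: the round-robin "circle method", written in plain host-side C. The names Pair, partition_pairs and cell_is_independent are hypothetical, and the ball count is assumed to be even to keep the code short.

```c
#include <assert.h>
#include <stdlib.h>

/* One candidate collision: the indices of the two balls involved. */
typedef struct { int a, b; } Pair;

/*
 * Fill `cells` (n-1 rows of n/2 pairs, stored row-major) so that every
 * unordered pair of balls appears exactly once and no ball index is
 * repeated within a row.  Assumption: round-robin circle method; the
 * report does not describe its own scheme.  n must be even and >= 2.
 */
void partition_pairs(int n, Pair *cells) {
    assert(n >= 2 && n % 2 == 0);
    int rounds = n - 1, per = n / 2;
    for (int r = 0; r < rounds; r++) {
        /* Ball n-1 stays fixed; the other n-1 balls rotate around it. */
        cells[r * per] = (Pair){ n - 1, r };
        for (int k = 1; k < per; k++) {
            int a = (r + k) % (n - 1);
            int b = (r - k + (n - 1)) % (n - 1);
            cells[r * per + k] = (Pair){ a, b };
        }
    }
}

/* Return 1 if no ball index occurs twice in the given cell, else 0. */
int cell_is_independent(const Pair *cell, int per, int n) {
    unsigned char *seen = calloc((size_t)n, 1);
    int ok = (seen != NULL);
    for (int k = 0; ok && k < per; k++) {
        if (seen[cell[k].a]) ok = 0;
        seen[cell[k].a] = 1;
        if (seen[cell[k].b]) ok = 0;
        seen[cell[k].b] = 1;
    }
    free(seen);
    return ok;
}
```

With a layout like this, each row could be copied into its own buffer and used as the argument for one collideBalls() launch, with the global work size equal to the row length.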
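
The impulse-based collision response mentioned in the report can be sketched in plain C as follows. This follows the formula on the Princeton page the report cites; it is not the author's kernel. Vec2 and collide are hypothetical names, and the masses m1 and m2 are an assumption, since the report lists only positions, velocities and radii. The caller is assumed to invoke it only for balls that are in contact.

```c
/* Minimal 2D vector, standing in for OpenCL's float2. */
typedef struct { float x, y; } Vec2;

/*
 * Elastic collision between two touching balls.
 * p1, p2: centre positions; v1, v2: velocities, updated in place;
 * m1, m2: masses (assumed parameters); r1, r2: radii.
 */
void collide(Vec2 p1, Vec2 *v1, float m1, float r1,
             Vec2 p2, Vec2 *v2, float m2, float r2) {
    float dx  = p2.x - p1.x,   dy  = p2.y - p1.y;    /* delta r */
    float dvx = v2->x - v1->x, dvy = v2->y - v1->y;  /* delta v */
    float dvdr  = dx * dvx + dy * dvy;               /* delta v . delta r */
    float sigma = r1 + r2;       /* centre distance at the moment of contact */

    /* Magnitude of the impulse J, then its x and y components. */
    float J  = 2.0f * m1 * m2 * dvdr / (sigma * (m1 + m2));
    float Jx = J * dx / sigma;
    float Jy = J * dy / sigma;

    /* Apply equal and opposite impulses. */
    v1->x += Jx / m1;  v1->y += Jy / m1;
    v2->x -= Jx / m2;  v2->y -= Jy / m2;
}
```

For two equal-mass balls meeting head-on, this swaps their velocities, which is the expected result for an elastic collision.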