Ming-Lun "Allen" Chou | 周明倫

Game Programming: Time Slicing

Allen Chou — Fri, 28 May 2021 06:18:32 +0000

Source files and future updates are available on Patreon.
You can follow me on Twitter.

This post is part of my Game Programming Series.

Prerequisite

Delayed Result Gathering

Introduction

In the previous tutorial, I used exposure avoidance as an example to demonstrate how to optimize computation with delayed result gathering. The basic idea is: kick jobs to run on worker threads and gather the results later. This prevents the main thread from being stalled.

If the game can afford one-frame latency, then we can kick the job in one frame and gather the results in the next frame. If the game cannot afford such latency, it’s still worth trying kicking the job early and gather the results later in the same frame.

What if the job takes too long to fit in a single frame? Or what if the job is just taking longer than we’d like? Then we can split the work across multiple frames. This is the core idea of time slicing, another optimization technique I picked up at work and one of my longtime favorites.

If you think about it, time slicing is everywhere. Texture streaming, seamless loading, etc., work that happens “in the background” without tanking a game’s frame rate. Can’t do it all in one frame? Then do it across multiple! It’s a simple but very effective idea.

The Exposure Map Example Revisited

Recall our last iteration on the exposure map example. The jobs to set up raycasts, perform raycasts, and gather raycast results are kicked at the end of the update function and waited for completion at the start of the update function in the next frame.

Here are the job structs:

struct RaycastSetupJob : IJobParallelFor
{
  public Vector3 EyePos;
 
  [ReadOnly]
  public NativeArray Grid;
 
  [WriteOnly]
  public NativeArray Commands;
 
  public void Execute(int index)
  {
    Vector3 cellCenter = Grid[index];
    Vector3 vec = cellCenter - EyePos;
    Commands[index] = 
      new RaycastCommand(EyePos, vec.normalized, vec.magnitude);
  }
}
 
struct RaycastGatherJob : IJobParallelFor
{
  [ReadOnly]
  public NativeArray Results;
 
  [WriteOnly]
  public NativeArray ExposureMap;
 
  public void Execute(int index)
  {
    bool exposed = (Results[index].distance <= 0.0f);
    ExposureMap[index] = exposed;
  }
}

Here is the update function:

id UpdateExposureMap()
{
  // wait for jobs from last frame to complete
  hGatherJob.Complete();
 
  // double-buffering
  SwapExposureBackBuffer();
 
  // dispose of job data allocated from last frame
  if (Commands.IsCreated)
    Commands.Dispose();
  if (Results.IsCreated)
    Results.Dispose();
 
  // allocate data shared across jobs
  var allocator = Allocator.TempJob;
  Commands =
    new NativeArray(NumCells, allocator);
  Results =
    new NativeArray(NumCells, allocator);
 
  // create setup job
  var setupJob = new RaycastSetupJob();
  setupJob.EyePos = EyePos;
  setupJob.Grid = Grid;
  setupJob.Commands = Commands;
 
  // create gather job
  var gatherJob = new RaycastGatherJob();
  gatherJob.Results = Results;
  gatherJob.ExposureMap = ExposureMap;
 
  // schedule setup job
  var hSetupJob = setupJob.Schedule(NumCells, JobBatchSize);
 
  // schedule raycast job
  // specify dependency on setup job
  var hRaycastJob = 
    RaycastCommand.ScheduleBatch
    (
      Commands, 
      Results, 
      JobBatchSize, 
      hSetupJob
    );
 
  // schedule gather job
  // specify dependency on raycast job
  hGatherJob = 
    gatherJob.Schedule(NumCells, JobBatchSize, hRaycastJob);
 
  // kick jobs
  JobHandle.ScheduleBatchedJobs();
}

And here is what the profiler shows:

The jobs run outside the span of the update function, freeing up the main thread and not stalling it.

Time Slicing

Now, let’s see how we can further reduce the CPU time spent on the raycast job with time slicing. Here is the pattern I usually use for time slicing:

Declare an index (or iterator) that to keep track of the current progress of time slicing.
Initialize the index at the start of a new batch of work.
Advance the index as additional portion of the work is done in each time slice.
Execute final processing logic when the last portion of the work is done.
Go back to step 2 to start a new batch of work.

Let’s modify our job structs so that they only process a portion of the exposure map in one frame. Here are the changes:

Add a TimeSliceBaseIndex field (initialized to 0) to keep track of time slicing progress.
Add an index field to the job strcuts to mark the beginning of the range of exposure map cells to process in each frame.
Change the WriteOnly attribute on the ExposureMap field of the RaycastGatherJob struct to NativeDisableParallelForRestriction in order to remove Unity’s safeguard that limits the access to native arrays to only the index passed into the jobs’ Execute functions.

So the job structs become:

struct RaycastSetupJob : IJobParallelFor
{
  public Vector3 EyePos;

  [ReadOnly]
  public NativeArray Grid;

  [ReadOnly]
  public NativeArray Commands;

  public int TimeSliceBaseIndex;

  public void Execute(int localIndex)
  {
    int globalIndex = localIndex + TimeSliceBaseIndex;
    Vector3 cellCenter = Grid[globalIndex];
    Vector3 vec = cellCenter - EyePos;
    Commands[localIndex] = 
      new RaycastCommand(EyePos, vec.normalized, vec.magnitude);
  }
}

struct RaycastGatherJob : IJobParallelFor
{
  [ReadOnly]
  public NativeArray Results;

  [NativeDisableParallelForRestriction]
  public NativeArray ExposureMap;

  public int TimeSliceBaseIndex;

  public void Execute(int localIndex)
  {
    int globalIndex = localIndex + TimeSliceBaseIndex;
    bool exposed = (Results[localIndex].distance <= 0.0f);
    ExposureMap[globalIndex] = exposed;
  }
}

And this is the new update function:

void UpdateExposureMap(int numRaysPerTimeSlice)
{
  // wait for jobs from last frame to complete
  hGatherJob.Complete();

  // trim excess ray count for the last time slice of batch
  int numExcessRays = 
    TimeSliceBaseIndex + numRaysPerTimeSlice - NumCells;
  numRaysPerTimeSlice -= Mathf.Max(0, numExcessRays);

  // batch ended?
  if (TimeSliceBaseIndex < 0)
  {
    // double-buffering
    SwapExposureBackBuffer();

    // reset time slicing index
    TimeSliceBaseIndex = 0;
  }

  // dispose of job data allocated from last frame
  if (Commands.IsCreated)
    Commands.Dispose();
  if (Results.IsCreated)
    Results.Dispose();

  // allocate data shared across jobs
  var allocator = Allocator.TempJob;
  Commands =
    new NativeArray
    (
      numRaysPerTimeSlice, 
      allocator
    );
  Results =
    new NativeArray
    (
      numRaysPerTimeSlice, 
      allocator
    );
  
  // create setup job
  var setupJob = new RaycastSetupJob();
  setupJob.EyePos = EyePos;
  setupJob.Grid = Grid;
  setupJob.Commands = Commands;
  setupJob.TimeSliceBaseIndex = TimeSliceBaseIndex;
  
  // create gather job
  var gatherJob = new RaycastGatherJob();
  gatherJob.Results = Results;
  gatherJob.ExposureMap = ExposureMap;
  gatherJob.TimeSliceBaseIndex = TimeSliceBaseIndex;
  
  // schedule setup job
  var hSetupJob = 
    setupJob.Schedule
    (
      numRaysPerTimeSlice, 
      JobBatchSize
    );
  
  // schedule raycast job
  // specify dependency on setup job
  var hRaycastJob = 
    RaycastCommand.ScheduleBatch
    (
      Commands, 
      Results, 
      JobBatchSize, 
      hSetupJob
    );
  
  // schedule gather job
  // specify dependency on raycast job
  hGatherJob = 
    gatherJob.Schedule
    (
      numRaysPerTimeSlice, 
      JobBatchSize, 
      hRaycastJob
    );

  // advance time slice index
  TimeSliceBaseIndex += numRaysPerTimeSlice;

  // end of batch?
  if (TimeSliceBaseIndex >= NumCells)
  {
    // signal end of batch
    TimeSliceBaseIndex = -1;
  }

  // kick jobs
  JobHandle.ScheduleBatchedJobs();
}

If we set the number of rays per time slice to be 10% of the total rays per batch, it will take 10 frames to finish all raycasts for a batch and trigger a buffer swap. In the video below, you can see the front exposure map buffer being refreshed every time all raycasts are finished for a batch. A lower frame rate is used in this video to make the debug draw last longer per frame, so at full frame rate, the latency of updating the exposure map wouldn’t be as long.

If it can be guaranteed that there won’t be any race condition when the gathered raycast results are used to update the exposure map, it is possible to do away with double buffering and directly update a single exposure map buffer as raycast results come in. The end result is that the updated portion of the exposure map is reflected every frame and the overall system is more responsive.

As for performance, here are the stats for running 10k raycasts at different levels of time slicing (all with delayed gathering):

No Time Slicing:
CPU time on main thread: 0.073ms
Average raycast time on each busy CPU core: 0.156ms
Total raycast time on all CPU cores: 1.87ms
Exposure map update latency: 1 frame
50% Rays Per Time Slice:
CPU time on main thread: 0.058ms
Average raycast time on each busy CPU core: 0.093ms
Total raycast time on all CPU cores: 1.11ms
Exposure map update latency: 2 frames
10% Rays Per Time Slice:
CPU time on main thread: 0.044ms
Average raycast time on each busy CPU core: 0.018ms
Total raycast time on all CPU cores: 0.21ms
Exposure map update latency: 10 frames

The CPU time on the main thread reduces slightly as time slicing gets more aggressive. The CPU time on raycasts is roughly proportional to the number of rays per time slice.

Conclusion

I have shown an example of how to use time slicing to split work across multiple frames. With time slicing, there is quite some freedom to adjust the balance between computation time budget and update latency.

If you’re familiar with Unity’s coroutines, you might realize that it’s possible to implement a form of time slicing using coroutines and yielding after a portion of work is finished. That is technically correct, but be aware that Unity’s coroutines only run on the main thread and does not take advantage of worker threads, so they are actually stalling the main thread when running. It’s highly recommended to take advantage of all CPU cores and leave as little work to the main thread as possible.

If you enjoyed this tutorial and would like to see more, please consider supporting me on Patreon. By doing so, you can also get updates on future tutorials. Thanks!

The post Game Programming: Time Slicing first appeared on Ming-Lun "Allen" Chou | 周明倫.

Game Programming: Delayed Result Gathering

Allen Chou — Tue, 18 May 2021 08:39:33 +0000

Source files and future updates are available on Patreon.
You can follow me on Twitter.

This post is part of my Game Programming Series.

本文之中文翻譯在此

Introduction

Back at DigiPen (my college where I learned gamedev), whenever people from the games industry came over on a company day to share their industry experience, one question always came up during Q&A: please tell us one thing you wish you had known at school (it can be anything: technical skills, social skills, production skills, etc.). I often wondered what I would tell the students if I had been at the podium. For a very long time, I had no answers in mind. But now, if I were to present on a company day, I would probably have an answer specifically for CS students. I wish I had known two particular optimization techniques when I was working on my game projects back at school: delayed result gathering and time slicing.

These are the first two techniques I learned since I joined Naughty Dog. And I have found them the most useful throughout my career. I have used them in numerous in-game systems up to this day. They are simple general solutions that can be applied almost everywhere and are usually very effective.

In this tutorial, I will talk about the first of the two: delayed result gathering. I will cover time slicing in the next tutorial.

The Example: Exposure Map & Exposure Avoidance

First, I’ll introduce the example that serves as the testbed for the two optimization techniques. It will be modified throughout these two tutorials to demonstrate how to apply the optimization techniques to an existing system.

The example is exposure map & exposure avoidance. We have a grid that represents whether each cell is exposed to a “sentinel,” and we have a character that tries to move to and hide in a nearby cell that is not exposed. As the sentinel moves around, the character runs away once their current cell becomes exposed. Meet shy ball.

Here are the steps within the update loop for this mechanic:

Update the exposure map based on the sentinel’s current position.
Cast a ray from the sentinel to each cell. If the ray is clear, the cell is exposed.
Compute the path lengths from the character to each grid cell.
If the character’s current cell is exposed, update their destination to a nearby cell that is not exposed, and navigate the character to the destination.

And this is what the update loop looks like in code:

void Update()
{
  UpdateExposureMap();
  UpdatePathLengths();
  UpdateDestination();
}

If I were still in college, my implementation of UpdateExposureMap probably would have looked like this:

void UpdateExposureMap()
{
  for (int iCell = 0; iCell < NumCells; ++iCell)
  {
    Vector3 cellCenter = Grid[iCell];
    Vector3 vec = cellCenter - EyePos;
    Vector3 rayDir = vec.normalized;
    float rayLen = vec.magnitude;

    bool exposed = !Physics.Raycast(EyePos, rayDir, rayLen);
    ExposureMap[iCell] = exposed;
  }
}

During the entirely of this tutorial and the next one, I will solely focus on optimizing step 1, i.e. UpdateExposureMap, so the implementation details of steps 2 & 3, UpdatePathLengths and UpdateDesination, are omitted. Note that the optimization techniques are not about optimizing specific algorithms (the raycasts themselves in this case) at a lower level, but optimization how they are called at a higher level. Also, depending on the use case, raycasts might not be the best way to update exposure maps and are used here solely for demonstration purposes.

For the implementation above, this is what the profiler shows for 10k raycasts per frame:

When zoomed in further, we can see that the wide orange rectangle is actually tons of tiny raycast sections, one for each raycast.

An immediate problem arises: raycasts in large quantities are not exactly cheap. Making these function calls and waiting for them to return sequentially stalls the main thread pretty badly. Optimization is in order.

Multi-Threading

A series of blocking raycast function calls on the main thread is pretty bad. Raycasts belong to the family of embarrassingly parallelizable operations. Each raycast does the same thing: it queries the spatial data and finds where the ray hits collision, and this operation can be self-contained on a separate thread. Instead of filling up the main thread with sequential raycast function calls, the first thing to try is offloading the work to other CPU cores, i.e. taking advantage of multi-threading. If we have 10 worker cores, then as a first pass, we can try splitting the raycast work into 10 threads.

If you’re using Unity, there is luckily an official system to make it easier to take advantage of multi-threading: Unity’s job system. There is even an official job class specifically for multi-threaded raycasts! You set up the raycasts, tell the job how many raycasts to run on each thread, kick the job, and you’re done! The raycasts will be executed on multiple CPU cores, and all that’s left to do is wait for the job to finish and gather the results.

We define two job structs: RaycastSetupJob and RaycastGatherJob. The setup job sets up the raycast commands before the raycast job that does the actual raycast work (kicked by calling Unity’s RaycastCommand.ScheduleBatch). The gather job gathers raycast results and use them to update the exposure map. Here’s the dependency chain: The raycast job depends on the setup job, and the gahter job depends on the raycast job.

struct RaycastSetupJob : IJobParallelFor
{
  public Vector3 EyePos;

  [ReadOnly]
  public NativeArray Grid;

  [WriteOnly]
  public NativeArray Commands;

  public void Execute(int index)
  {
    Vector3 cellCenter = Grid[index];
    Vector3 vec = cellCenter - EyePos;
    Commands[index] = 
      new RaycastCommand(EyePos, vec.normalized, vec.magnitude);
  }
}

struct RaycastGatherJob : IJobParallelFor
{
  [ReadOnly]
  public NativeArray Results;

  [WriteOnly]
  public NativeArray ExposureMap;

  public void Execute(int index)
  {
    bool exposed = (Results[index].distance <= 0.0f);
    ExposureMap[index] = exposed;
  }
}

And the UpdateExposureMap function becomes:

void UpdateExposureMap()
{
  // allocate data shared across jobs
  var allocator = Allocator.TempJob;
  var commands = 
    new NativeArray(NumCells, allocator);
  var results = 
    new NativeArray(NumCells, allocator);

  // create setup job
  var setupJob = new RaycastSetupJob();
  setupJob.EyePos = EyePos;
  setupJob.Grid = Grid;
  setupJob.Commands = commands;

  // create gather job
  var gatherJob = new RaycastGatherJob();
  gatherJob.Results = results;
  gatherJob.ExposureMap = ExposureMap;

  // schedule setup job
  var hSetupJob = setupJob.Schedule(NumCells, JobBatchSize);

  // schedule raycast job
  // specify dependency on setup job
  var hRaycastJob = 
    RaycastCommand.ScheduleBatch
    (
      commands, 
      results, 
      JobBatchSize, 
      hSetupJob
    );

  // schedule gather job
  // specify dependency on raycast job
  var hGatherJob = 
    gatherJob.Schedule(NumCells, JobBatchSize, hRaycastJob);

  // kick jobs
  JobHandle.ScheduleBatchedJobs();

  // wait for completion
  hGatherJob.Complete();

  // dispose of job data
  commands.Dispose();
  results.Dispose();
}

Now this is what the profiler shows:

The CPU time taken up by the raycast step on the main thread is reduced. However, the main thread is still stalled while waiting for the multi-threaded raycast job on the worker threads to complete. Can we further improve this?

Delayed Result Gathering

Within the first few days I started working at Naughty Dog, I immediately found myself needing to touch a system that involved raycasts. One of the more senior programmers walked me through the process, and gave an offhand tip: “Oh, by the way, you don’t want to cast the rays and just have the thread wait there. You’ll want to cast the rays this frame, and delay the gathering of the results until the next frame.” That lit up a huge lightbulb in my head. Of course! If the system can afford a one-frame latency, then why not kick the job and gather the results the next frame? This way, after kicking the job, the CPU core can be immediately freed up to work on something else.

Note that if the results are needed within the same frame, it is still worth attempting to kick the job early, perform some other work, and gather the results later in the same frame. The point is to avoid stalling the main thread while waiting for job completion.

In next iteration of our example, I’m going to make the main thread kick the raycast job in one frame, and delay the gathering of the raycast results until the next frame. There is now effectively nothing left to take up CPU time on the main thread except for the gathering of raycast results from the previous frame and kicking the job that sets up the raycast job for the next frame.

The gather job is now waited for completion at the start of the UpdateExposureMap function. The exposure map is now double-buffered, so the jobs can work on a back buffer, while the rest of the code can access the front buffer to avoid race conditions. Also, the gather job handle and shared data across jobs are now promoted from local variables to fields, because their lifetime needs to span two frames.

void UpdateExposureMap()
{
  // wait for jobs from last frame to complete
  hGatherJob.Complete();

  // double-buffering
  SwapExposureBackBuffer();

  // dispose of job data allocated from last frame
  if (Commands.IsCreated)
    Commands.Dispose();
  if (Results.IsCreated)
    Results.Dispose();

  // allocate data shared across jobs
  var allocator = Allocator.TempJob;
  Commands =
    new NativeArray(NumCells, allocator);
  Results =
    new NativeArray(NumCells, allocator);

  // create setup job
  var setupJob = new RaycastSetupJob();
  setupJob.EyePos = EyePos;
  setupJob.Grid = Grid;
  setupJob.Commands = Commands;

  // create gather job
  var gatherJob = new RaycastGatherJob();
  gatherJob.Results = Results;
  gatherJob.ExposureMap = ExposureMap;

  // schedule setup job
  var hSetupJob = setupJob.Schedule(NumCells, JobBatchSize);

  // schedule raycast job
  // specify dependency on setup job
  var hRaycastJob = 
    RaycastCommand.ScheduleBatch
    (
      Commands, 
      Results, 
      JobBatchSize, 
      hSetupJob
    );

  // schedule gather job
  // specify dependency on raycast job
  hGatherJob = 
    gatherJob.Schedule(NumCells, JobBatchSize, hRaycastJob);

  // kick jobs
  JobHandle.ScheduleBatchedJobs();
}

And this is what the profiler shows now:

The raycast job is running on the worker threads outside of the span of UpdateExposureMap. The results are not needed until the call to UpdateExposureMap in the next frame. No more stalling the main thread!

Conclusion

I have shown the evolution of the raycast example in three iterations:

Raycast functions are called sequentially and gravely stalls the main thread.
Raycasts are offloaded to other worker threads but are still waited for on the main thread. This still stalls the main thread to some extent.
The main thread is used only for the delayed gathering of raycast results from the job kicked in the previous frame, as well as kicking the job that sets up the raycast job for the next frame.

Here are some numbers I gathered from profiling the 3 iterations of the example. My CPU is an Intel i7-8700K. The timing was measured for 10k raycasts per frame.

Iteration 1 (single-threaded)
CPU time on main thread: 4.09ms
Average raycast time on each busy CPU core: 1.24ms
Total raycast time on all CPU cores: 1.24ms
Iteration 2 (multi-threaded jobs, same-function result gathering)
CPU time on main thread: 0.47ms
Average raycast time on each busy CPU core: 0.16ms
Total raycast time on all CPU cores: 1.96ms
Iteration 3 (multi-threaded jobs, delayed result gathering)
CPU time on main thread: 0.073ms
Average raycast time on each busy CPU core: 0.16ms
Total raycast time on all CPU cores: 1.87ms

Even though the total raycast time on all CPU cores is the lowest for iteration 1, the overhead of individual raycast function calls results in a total of 4.09ms spent on the main thread. Plus, iteration 1 puts all the work through a single CPU core, leaving the other cores idle. For iterations 2 & 3, the total raycast time of approximately 1.87-1.96ms on all CPU cores is higher than the 1.24ms from iteration 1, likely due to overhead introduced by the job system. But that is okay. The most important metric is the average time on each busy CPU core.

The average and total CPU time for iterations 2 & 3 are almost identical as expected, as their only difference is the timing of result gathering. Iteration 2 already reduces the CPU time spent on the main thread from 4.09ms to 0.47ms. In iteration 3, the time is further lowered to 0.073ms. This is a big improvement from iteration 1 (4.09ms -> 0.073ms). CPU cores are better utilized, and the time spent on each busy core for raycasts is much lower than iteration 1 (1.24ms -> 0.16ms).

I have covered delayed result gathering. In the next tutorial, I will go over the other very useful optimization technique: time slicing, which can further optimize the last iteration of our example. Stay tuned!

If you enjoyed this tutorial and would like to see more, please consider supporting me on Patreon. By doing so, you can also get updates on future tutorials. Thanks!

The post Game Programming: Delayed Result Gathering first appeared on Ming-Lun "Allen" Chou | 周明倫.

Game Math: Dot Product, Rulers, And Bouncing Balls

Allen Chou — Sat, 18 Jan 2020 07:34:05 +0000

Source files and future updates are available on Patreon.
You can follow me on Twitter.

This post is part of my Game Math Series.

本文之中文翻譯在此

Prerequisites

Overview

The dot product is a simple yet extremely useful mathematical tool. It encodes the relationship between two vectors’ magnitudes and directions into a single value. It is useful for computing projection, reflection, lighting, and so much more.

In this tutorial, you’ll learn:

The geometric meaning of the dot product.
How to project one vector onto another.
How to measure an object’s dimension along an arbitrary ruler axis.

How to reflect a vector relative to a plane.
How to bounce a ball off a slope.

The Dot Product

Let’s say we have two vectors, and . Since a vector consists of just a direction and a magnitude (length), it doesn’t matter where we place it in a figure. Let’s position and so that they start at the same point:

The dot product is a mathematical operation that takes two vectors as input and returns a scalar value as output. It is the product of the signed magnitude of the first vector’s projection onto the second vector and the magnitude of the second vector. Think of projection as casting shadows using parallel light in the direction perpendicular to the vector being projected onto:

We write the dot product of and as (read a dot b).

If the angle between the two vectors is less than 90 degrees , the signed magnitude of the first vector is positive (thus simply the magnitude of the first vector). If the angle is larger than 90 degrees, the signed magnitude of the first vector is its negated magnitude.

Which one of the vectors is “the first vector” doesn’t matter. Reversing the vector order gives the same result:

If is a unit vector, the signed magnitude of the projection of onto is simply .

Cosine-Based Dot Product Formula

Notice that there’s a right triangle in the figure. Let the angle between and be :

Recall from this tutorial that the length of the adjacent side of a right triangle is the length of its hypotenuse multiplied by the cosine of the angle , so the signed magnitude of the projection of onto is :

So the dot product of two vectors can be expressed as the product of each vector’s magnitude and the cosine of the angle between the two, which also reaffirms the property that the order of the vectors doesn’t matter:

If both and are unit vectors, then simply equals to .

If the two vectors are perpendicular (angle in between is ), the dot product is zero. If the angle between the two vectors is smaller than , the dot product is positive. If the angle is larger than , the dot product is negative. Thus, we can use the sign of the dot product of two vectors to get a very rough sense of how aligned their directions are.

Since monotonically decreases all the way to , the more similar the directions of the two vectors are, the larger their dot product; the more opposite the directions of the two vectors are, the smaller their dot product. In the extreme cases where the two vectors point in the exact same direction () and the exact opposite directions (), their dot products are and , respectively.

Component-Based Dot Product Formula

When we have two 3D vectors as triplets of floats, it isn’t immediately clear what the angle in between them are. Luckily, there’s an alternate way to compute the dot product of two vectors that doesn’t involve taking the cosine of the angle in between. Let’s denote the components of and as follows:

Then the dot product of the two vectors is also equal to the sum of component-wise products, and can be written as:

Simple, and no cosine needed!

Unity provides a function Vector3.Dot for computing the dot product of two vectors:

float dotProduct = Vector3.Dot(a, b);

Here is an implementation of the function:

Vector3 Dot(Vector3 a, Vector b)
{
  return a.x * b.x + a.y * b.y + a.z * b.z;
}

The formula for computing a vector’s magnitude is and can also be expressed using the dot product of the vector with itself:

Recall the formula . This means if we know the dot product and the magnitudes of two vectors, we can reverse-calculate the angle between them by using the arccosine function:

If and are unit vectors, we can further simplify the formulas above by skipping the computation of vector magnitudes:

Vector Projection

Now that we know the geometric meaning of the dot product as the product of a projected vector’s signed magnitude and another vector’s magnitude, let’s see how we can project one vector onto another. Let denote the projection of onto :

The unit vector in the direction of is , so if we scale it by the signed magnitude of the projection of onto , then we will get . In other words, is parallel to the direction of and has a magnitude equal to that of the projection of onto .

Since the dot product is the product of the magnitude of and the signed magnitude of the projection of onto , the signed magnitude of is just the dot product of and divided by the magnitude of :

Multiplying this signed magnitude with the unit vector gives us the formula for vector projection:

Recall that , so we can also write the projection formula as:

And if , the vector to project onto, is a unit vector, the projection formula can be further simplified:

Unity provides a function Vector3.Project that computes the projection of one vector onto another:

Vector3 projection = Vector3.Project(vec, onto);

Here is an implementation of the function:

Vector3 Project(Vector3 vec, Vector3 onto)
{
  float numerator = Vector3.Dot(vec, onto);
  float denominator = Vector3.Dot(onto, onto);
  return (numerator / denominator) * onto;
}

Sometimes we need to guard against a potential degenerate case, where the vector being projected onto is a zero vector or a vector with an overly small magnitude, producing a numerical explosion as the projection involves division by zero or near-zero. This can happen with Unity’s Vector3.Project function.

One way to handle this is to compute the magnitude of the vector being projected onto. Then, if the magnitude is too small, use a fallback vector (e.g. the unit +X vector, the forward vector of a character, etc.):

Vector3 SafeProject(Vector3 vec, Vector3 onto, Vector3 fallback)
{
  float sqrMag = v.sqrMagnitude;
  
  if (sqrMag > Epsilon) // test against a small number
    return Vector3.Project(vec, onto);
  else
    return Vector3.Project(vec, fallback);
}

Exercise: Ruler

Here’s an exercise for vector projection: make a ruler that measures an object’s dimension along an arbitrary axis.

A ruler is represented by a base position (a point) and an axis (a unit vector):

struct Ruler
{
  Vector3 Base;
  Vector3 Axis;
}

Here’s how you project a point onto the ruler. First, find the relative vector from the ruler’s base position to the point. Next, project this relative vector onto the ruler’s axis. Finally, the point’s projection is the ruler’s base position offset by the projected relative vector:

Vector3 Project(Vector3 vec, Ruler ruler)
{
  // compute relative vector
  Vector3 relative = vec - ruler.Base;
  
  // projection
  float relativeDot = Vector3.Dot(vec, ruler.Axis);
  Vector3 projectedRelative = relativeDot * ruler.Axis;

  // offset from base
  Vector3 result = ruler.Base+ projectedRelative;

  return result;
}

The intermediate relativeDot value above basically measures how far away the point’s projection is from the ruler’s base position, in the direction of the ruler’s axis if positive, or in the opposite direction of the ruler’s axis if negative.

If we compute such measurement for each vertex of an object’s mesh and find the minimum and maximum measurements, then we can obtain the object’s dimension measured along the ruler’s axis by subtracting the minimum from the maximum. Offsetting from the ruler’s base position by the ruler’s axis vector multiplied by these two extreme values gives us the two ends of the projection of the object onto the ruler.

void Measure
(
  Mesh mesh, 
  Ruler ruler, 
  out float dimension, 
  out Vector3 minPoint, 
  out Vector3 maxPoint
)
{
  float min = float.MaxValue;
  float max = float.MinValue;

  foreach (Vector3 vert in mesh.vertices)
  {
    Vector3 relative = vert- ruler.Base;
    float relativeDot = Vector3.Dot(relative , ruler.Axis);
    min = Mathf.Min(min, relativeDot);
    max = Mathf.Max(max, relativeDot);
  }
  
  dimension = max - min;
  minPoint = ruler.Base+ min * ruler.Axis;
  maxPoint = ruler.Base+ max * ruler.Axis;
}

Vector Reflection

Now we are going to take a look at how to reflect a vector, denoted , relative to a plane with its normal vector denoted :

We can decompose the vector to be reflected into a parallel component (denoted ) and a perpendicular component (denoted ) with respect to the plane:

The perpendicular component is the projection of the vector onto the plane’s normal, and the parallel component can be obtained by subtracting the perpendicular component from the vector:

Flipping the direction of the perpendicular component and adding it to the parallel component gives us the reflected vector off the plane.

Let’s denote the reflection ):

If we substitute with , we get an alternative formula:

Unity provides a function Vector3.Reflect for computing vector reflection:

float reflection = Vector3.Reflect(vec, normal);

Here is an implementation of the function using the first reflection formula:

Vector3 Reflect(Vector vec, Vector normal)
{
  Vector3 perpendicular= Vector3.Project(vec, normal);
  Vector3 parallel = vec - perpendicular;
  return parallel - perpendicular;
}

And here is an implementation using the alternative formula:

Vector3 Reflect(Vector vec, Vector normal)
{
  return vec - 2.0f * Vector3.Project(vec, normal);
}

Exercise: Bouncing A Ball Off A Slope

Now that we know how to reflect a vector relative to a plane, we are well-equipped to simulate a ball bouncing off a slope.

We are going to use the Euler Method mentioned in a previous tutorial to simulate the trajectory of a ball under the influence of gravity.

ballVelocity+= gravity * deltaTime;
ballCenter += ballVelocity* deltaTime;

In order to detect when the ball hits the slope, we need to know how to detect when a ball penetrates a plane.

A sphere can be defined by a center and a radius. A plane can be defined by a normal vector and a point on the plane. Let’s denote the sphere’s center , the sphere radius , the plane normal (a unit vector), and a point on the plane . Also, let the vector from to be denoted .

If the sphere does not penetrate the plane, the component of perpendicular to the plane, denoted , should be in the same direction as and have a magnitude no less than .

In other words, the sphere does not penetrate the plane if ; otherwise, the sphere is penetrating the plane by the amount and its position needs to be corrected.

In order to correct a penetrating sphere’s position, we can simply move the sphere in the direction of the plane’s normal by the penetration amount. This is an approximated solution and not physically correct, but it’s good enough for this exercise.

// returns original sphere center if not penetrating
// or corrected sphere center if penetrating
void SphereVsPlane
(
  Vector3 c,        // sphere center
  float r,          // sphere radius
  Vector3 n,        // plane normal (unit vector)
  Vector3 p,        // point on plane
  out Vector3 cNew, // sphere center output
)
{
  // original sphere position as default result
  cNew = c;

  Vector3 u = c - p;
  float d = Vector3.Dot(u, n);
  float penetration = r - d;

  // penetrating?
  if (penetration > 0.0f)
  {
    cNew = c + penetration * n;
  }
}

And then we insert the positional correction logic after the integration.

ballVelocity += gravity * deltaTime;
ballCenter += ballVelocity* deltaTime;

Vector3 newSpherePosition;
SphereVsPlane
(
  ballCenter, 
  ballRadius, 
  planeNormal, 
  pointOnPlane, 
  out newBallPosition
);

ballPosition = newBallPosition;

We also need to reflect the sphere’s velocity relative to the slope upon positional correction due to penetration, so it bounces off correctly.

The animation above shows a perfect reflection and doesn’t seem natural. We’d normally expect some sort of degradation in the bounced ball’s velocity, so it bounces less with each bounce.

This is typically modeled as a restitution value between the two colliding objects. With 100% restitution, the ball would bounce off the slope with perfect velocity reflection. With 50% restitution, the magnitude of the ball’s velocity component perpendicular to the slope would be cut in half. The restitution value is the ratio of magnitudes of the ball’s perpendicular velocity components after versus before the bounce. Here is a revised vector reflection function with restitution taken into account:

Vector3 Reflect
(
  Vector3 vec, 
  Vector3 normal, 
  float restitution
)
{
  Vector3 perpendicular= Vector3.Project(vec, normal);
  Vector3 parallel = vec - perpendicular;
  return parallel - restitution * perpendicular;
}

Here is the modified SphereVsPlane function that takes variable restitution into account:

// returns original sphere center if not penetrating
// or corrected sphere center if penetrating
void SphereVsPlane
(
  Vector3 c,        // sphere center
  float r,          // sphere radius
  Vector3 v,        // sphere velocity
  Vector3 n,        // plane normal (unit vector)
  Vector3 p,        // point on plane
  float e,          // restitution
  out Vector3 cNew, // sphere center output
  out Vector3 vNew  // sphere velocity output
)
{
  // original sphere position & velocity as default result
  cNew = c;
  vNew = v;

  Vector3 u = c - p;
  float d = Vector3.Dot(u, n);
  float penetration = r - d;

  // penetrating?
  if (penetration > 0.0f)
  {
    cNew = c + penetration * n;
    vNew = Reflect(v, n, e);
  }
}

And the positional correction logic is replaced with a complete bounce logic:

ballVelocity+= gravity * deltaTime;
spherePosition += ballVelocity* deltaTime;

Vector3 newSpherePosition;
Vector3 newSphereVelocity;
SphereVsPlane
(
  spherePosition , 
  ballRadius, 
  ballVelocity, 
  planeNormal, 
  pointOnPlane, 
  restitution, 
  out newBallPosition, 
  out newBallVelocity;
);

ballPosition= newBallPosition;
ballVelocity= newBallVelocity;

Finally, now we can have balls with different restitution values against a slope:

Summary

In this tutorial, we have been introduced to the geometric meaning of the dot product and its formulas (cosine-based and component-based).

We have also seen how to use the dot product to project vectors, and how to use vector projection to measure objects along an arbitrary ruler axis.

Finally, we have learned how to use the dot product to reflect vectors, and how to use vector reflection to simulate balls bouncing off a slope.

If you enjoyed this tutorial and would like to see more, please consider supporting me on Patreon. By doing so, you can also get updates on future tutorials. Thanks!

The post Game Math: Dot Product, Rulers, And Bouncing Balls first appeared on Ming-Lun "Allen" Chou | 周明倫.

Game Math: Inverse Trigonometric Functions, Slope Angles, And Facing Objects

Allen Chou — Tue, 22 Oct 2019 05:28:02 +0000

Source files and future updates are available on Patreon.
You can follow me on Twitter.

This post is part of my Game Math Series.

本文之中文翻譯在此

Prerequisites

Overview

At this point, we have learned about the three basic trigonometric functions: sine, cosine, and tangent. Now, we are going to take a look at their inverse functions, as well as how they can be utilized in games.

In this tutorial, you’ll learn:

The inverse functions of the three basic trigonometric functions.
How to compute the angle of a slope given a desired slope value.

The domains and ranges of inverse trigonometric functions.
The special convenience inverse trigonometric function atan2.
How to make an object face towards the mouse cursor.

Inverse Functions

A function can be treated like a black box that takes some input and gives you some output. If a function takes an input and spits out an output , we can write it as (read y equals f of x). Meanwhile, if a function can take an output of and give back what input takes that could produce such output, we say that such function is the inverse of , and we write it as (read f inverse).

In other words, if the function takes and gives you , which can be written as , then can take as input and give you , which can be written as .

An example of a function verses its inverse is a function that adds one to its input and a function that subtracts one from its input. Let denote the function that adds one to , and denote the one that subracts one from . If we feed into , we get:

Now, if we feed back into , we get our original input back:

Inverse Trigonometric Functions

We already know that trigonometric functions take an angle as input and produce a number as output. We can feed the output of a trigonometric function (a real number) into its inverse function, and the inverse function would spit out the original input to the trigonometric function (an angle in radians). For example, , and .

Inverse trigonometric functions have special names. Rather than “sine inverse”, the inverse of sine, written as , is called arcsine. Similarly, and are called arccosine and arctangent, respectively. In Unity, here’s how you’d call these three inverse trigonometric functions:

float sinAngle = Mathf.Asin(sinValue); // arcsine
float cosAngle = Mathf.Acos(cosValue); // arccosine
float tanAngle = Mathf.Atan(tanValue); // arctangent

Slope Angles

As a quick example, if we know the ratio of vertical rise versus horizontal offset of a hill in a game level, how do we compute the angle of the slope? Using the illustration below, how do we compute from the vertical rise and horizontal offset ?

The goal is to express using and . First, we can relate to and using the tangent function:

Next, we can obtain by feeding into :

Alternatively, we can view the equation above as the result of taking the arctangent of both sides of the previous equation. Generally, cancels out and gives you ; similarly, cancels out and gives you .

The angle is in radians. As mentioned in an earlier tutorial, we can convert the angle’s unit to degrees by multiplying it with .

So, we can make a little interactive program that allows the user to move a point that forms a slope with the origin, and use the point’s coordinates to compute and display the slope angle.

And here’s the code:

Vector3 point = p.transform.position;

// compute slope angle in radians
float angleRad = Mathf.Atan(point.y / point.x);

// convert to degrees
// Mathf.Rad2Deg is a constant equal to 180.0f / Pi
float angleDeg = angleRad* Mathf.Rad2Deg;

text = angleDeg + &amp;amp;quot;°&amp;amp;quot;;

Domains And Ranges

When using inverse trigonometric functions, it’s important to understand their domains and ranges.

The domain of a function is the collection of all valid values as input, and the range of a function is the collection of all possible output values.

For example, the domain of is the collection of all real numbers, because you can pass any angle to it as input. And the range of is , which is a notation for the collection including all values between and including -1 and 1. If a parenthesis is used instead of a bracket, it means that side of the boundary is not included in the collection; for example, denotes a collection including all values between 0 and 10, but only including the boundary 0 and not the boundary 10.

The inverse of a function should simply have a domain and range equal to the range and domain, respectively, of the corresponding function, right? For inverse trigonometric functions, that’s not the case.

Trigonometric functions are periodic, which means multiple different input values can result in the same output value. For and , they even can have different input values within a single period resulting in the same output value.

Let’s use as an example again. Both and give the same value 1. So what is the output of ? It can’t be simultaneously equal to , , or other inputs that make equal to 1. In fact, the ranges of inverse trigonometric functions are chosen to be of limited range, commonly agreed upon and universally used.

Since the ranges of and are both , the domains of and are both as well. The ranges of and are chosen to be and , respectively. These ranges cover an angle range of radians, or 180 degrees.

Hence, is equal to , which is the one and only input that makes equal to 1 and lies within the range .

As for , since the range of is the collection of all real numbers, the domain of is the collection of all real numbers as well. And the range of is chosen to be , same as that of .

The Atan2 Convenience Function

Lets say we have a point in 2D, and it is in the first quadrant, i.e. and . Let be the angle from the axis to the line segment connecting the origin and .

We know that , so we can compute from the coordinates of using the arctangent function: . Since both and are positive, would lie within , encompassed within the full range of arctangent, which is .

This this is what the computation looks like in code:

float angle = Mathf.Atan(p.y / p.x);

What if is in the fourth quadrant, i.e. and ? would become negative and would output a negative angle within , which is also encompassed within the full range of arctangent, .

Problems arise when we have in the second or third quadrant. If is in the second quadrant, i.e. and , the fraction is negative. We can find a point in the second quadrant that results in a ratio equal to a ratio from a point in the fourth quadrant. One such point pair are those that satisfy .

The two points and in the figure below have identical coordinate ratios .

Also seen in the figure above is that the coordinate ratios of points and , when compared to the coordinate ratios of points in the first quadrant and in the third quadrant, only differ in signs (negative instead of positive). All the absolute sharp angles (angles less than 90 degrees) between the line segments connecting the origin & the points and the X axis are identical.

The ratio is equal to the ratio , which is in turn equal to , because the two negative signs cancel out. So, if we pass as input to the arctangent function, actually gives you the same negative angle as , because an angle in the fourth quadrant is within the range of arctangent, but an angle in the second quadrant is not.

When we pass in to the arctangent function, what we really want to get is the green positive astute angle (angle larger than 90 grees) shown in the figure below, not the red negative sharp ones. We always want to start measuring angles from the +X direction.

In order to do so, before combining and into a ratio and passing it to the arctangent function, we check the signs of and first to see which quadrant the point is in. And if we get an angle outside the range , we fix up the output of the arctangent function to get the output angle in the correct quadrant. Here’s the code that does this fix-up:

// range of this function is (-pi, pi]
float FixedUpAtan(float py, float px)
{
  if (px > 0.0f) // normal, no fix-up needed
  {
    // &amp;amp;quot;normal&amp;amp;quot;
    // py > 0.0f : first quadrant
    // py < 0.0f : fourth quadrant
    return Mathf.Atan(py / px);
  }
  else if (px < 0.0f) // fix-up needed
  {
    if (py > 0.0f) // second quadrant
      return Math.PI + Mathf.Atan(py / px);
    else if (py < 0.0f) // third quadrant
      return -Math.PI + Mathf.Atan(py / px);
    else // angle on negative X axis
      return 2.0f * Mathf.PI;
  }
  else // infinity
  {
    if (py > 0.0f)
      return 0.5f * Mathf.PI; // ratio is positive infinity
    else if (py < 0.0f)
      return -0.5f * Mathf.PI; // ratio is negative infinity
    else
      return 0.0f; // degenerate input (the origin)
  }
}

That seems like quite a lot of work. Luckily, almost all standard math libraries in any programming languages provide a convenience function called atan2, which has a full 360-degree range of and does exactly what the code above does (most likely in a more efficient and optimized fashion). Note that the argument order is Y first and X second. Atan2 in different libraries may have different ordering of the two arguments, but based on what I’ve seen, Y followed by X is pretty common.

I often see a misconception that atan2 is just an alternative to the arctangent function and doesn’t do anything extra that arctangent cannot do. This is actually incorrect. The arctangent function only takes a single value as input, and its output range is . On the other hand, atan2 takes two values as input ( and before they are combined into a single ratio), and the output has a full 360-degree range of .

Facing An Object Towards The Mouse Cursor in 3D

Lastly, let’s look at a classic example of facing an object towards the mouse cursor.

First, find the intersection between the ray under the mouse cursor and the ground plane. Then, place an object at that intersection, creating the effect of the object following the mouse cursor in 3D. This object is our look target.

Camera cam = Camera.current;
Vector3 mouse= Input.mousePosition;
Ray ray = cam.ScreenPointToRay(mouse);
float rayDist;
plane.Raycast(ray, out rayDist);
sphere.position = ray.GetPoint(rayDist);

Next, let’s use our old friend UFO Bunny from Boing Kit again. When un-rotated, her forward vector is in the +X direction, and her left vector is in the +Z direction. We want to face her towards the look target.

Then, let UFO Bunny be the origin, and calculate the coordinates of the look target relative to her:

Vector3 coord =  
  sphere.transform.position 
  - ufoBunny.transform.position;

Now, let’s mark up the scene with an angle between the X axis and the line segment connecting UFO Bunny and the look tartget:

As shown before, the angle can be calculated from the convenient atan2 function:

float thetaRad = Mathf.atan2(coord.z, coord.x); // in radians

Recall this figure:

This figure shows the XY plane, and as increases, rotates counterclockwise around the origin. The rotation axis of such rotation is the +Z axis (later tutorials will explain this in more details). The UFO Bunny and the look target lie on the XZ plane; to translate the figure on the XY plane to the XZ plane, we map the +X axis to the +X axis, the +Y axis to the +Z axis, and the rotation axis of the +Z axis to the -Y axis.

Now that we have the rotation axis and the desired rotation angle, we can finally construct a quaternion representing such rotation. Quaternions will also be covered in later tutorials. For now, we just need to know that quaternion is a type of data Unity uses to represent object rotation.

float thetaDeg = thetaRad * Mathf.Rad2Deg; // in degrees
float axis = Vector3.down; // (0, -1, 0) == -Y axis
Quaternion rot = Quaternion.AngleAxis(thetaDeg, axis);
ufoBunny.transform.rotation = rot;

And here’s our final result:

Note: Unity already provides helper functions like Quaternion.LookRotation and Transform.LookAt that can achieve the same effect. But the purpose of this tutorial is to help understand inverse trigonometric functions.

Summary

In this tutorial, we have been introduced to the inverse trigonometric functions, how they relate to their corresponding trigonometric functions, and their domains and ranges.

Also, we have seen that the arctangent function doesn’t have a full 360-degree range, but a convenient utility function atan2 does.

Lastly, we have learned how to use the atan2 function to implement the classic example of facing an object towards the mouse cursor.

If you enjoyed this tutorial and would like to see more, please consider supporting me on Patreon. By doing so, you can also get updates on future tutorials. Thanks!

The post Game Math: Inverse Trigonometric Functions, Slope Angles, And Facing Objects first appeared on Ming-Lun "Allen" Chou | 周明倫.

Game Math: Trigonometry Basics – Tangent, Triangles, And Cannonballs

Allen Chou — Sat, 31 Aug 2019 07:17:20 +0000

Source files and future updates are available on Patreon.
You can follow me on Twitter.

This post is part of my Game Math Series.

本文之中文翻譯在此

Prerequisite

Trignometry Basics – Sine & Cosine

Overview

In the previous tutorial, we have learned about two basic trigonometric functions: sine & cosine. This time, we are going to look at another basic trigonometric function: tangent. Together, these three functions form the basis of trigonometry, and they can be used to solve all sorts of geometric problems that arise in game development.

In this tutorial, you’ll learn:

A geometric interpretation of another basic trigonometric function: tangent.
The relationships among sine, cosine, and tangent.
How to use tangent to create smooth intro and outro motion.

How to relate angles and sides of right triangles using trigonometric functions.
How to simulate a cannonball, given an initial speed and an elevation angle.
How to draw predicted trajectories even before firing the cannonball.

How to place cannonball targets, given a horizontal distance and an elevation angle.

Geometric Interpretation of Tangent

Let’s look at the unit circle from the last tutorial, with a point on it, as well as the angle between the X axis (+X direction) and the line segment formed by and the origin.

Recall that the coordinates of is . This time we are going to look at a new trigonometric function: (tangent of theta). It is the slope of the line segment between and the origin.

The slope of a line is its ratio of vertical change versus horizontal change. For example, let’s look at this line segment:

To move from point to point , we walk 3 units in the +X direction and then 2 units in the +Y direction, so the slop of the line is .

And for a line segment that goes “downhill” like this:

The slope would be , a negative value, since the vertical change versus horizontal change is negative.

Now, back to the unit circle figure:

We see that moving from the origin to involves a horizontal change of and a vertical change of , so the slope of the line segment between the origin and is , hence .

But that’s just a mathematical expression. Here’s where visually fits into the unit circle figure. Let’s draw a tangential line to the circle at , i.e. a line that goes through and is perpendicular to the line segment between and the origin:

Let’s just look at the portion of this tangential line that is between and the X axis, and mark up some of the points:

The angles and are right angles. And . Let denote the length of the line segment between point and point .

Now, split the figure into two triangles:

Since all internal angles of a triangle add up to and both triangles have an angle of and , the unmarked angles from both triangles, and , are exactly the same: .

If two triangles have identical sets of angles, then they are similar, i.e. if you proportionally scale, rotate, and/or flip one of them, it can become identical to the other one.

When two triangles are similar, the ratio between the lengths of two sides from one triangle equals to the ratio between the lengths of the corresponding sides of the other triangle. Thus:

We know that the coordinates of are , so and . And we know that is equal to the radius of the unit circle, so . Now the equation above becomes:

And we know that , so we get:

We have found the visual representation of !

The absolute value of is the length of the portion of the tangential line between point and the X axis. Notice that I said absolute value, because depending on the signs of and , can be positive or negative. This animation highlights (in blue) the line segment whose length is equal to the absolute value of :

The Tangent Curve

We’ve seen the plots for and versus in the previous tutorial. Let’s overlay them on top of each other:

And now let’s add into the mix:

Notice how, unlike and , the value of is not constrained within the range. Since , the absolute value of approaches infinity as approaches zero. Also, unlike and , the period of is , instead of .

Another thing worth noting is the relationships among the signs of the three basic trigonometric functions. Since , the sign of is positive when and have the same sign, and is negative otherwise.

Now, let’s try plugging the tangent curve over time into the X coordinate of an object:

float tan = Mathf.Tan(Rate * Time.time);
obj.transform.position = Vector3(tan, 0.0f, 0.0f);

The object comes in fast from the direction, slows down a bit, and then runs off fast again towards the direction.

We can utilize this motion to create effects like these falling stars:

float tan = Mathf.Tan(Rate * Time.time);
obj.transform.position = center + moveDirection * tan;

The acceleration and deceleration are kind of subtle. We can further amplify the effect by raising the tangent function to a power of, say, 3:

float tan = Mathf.Tan(Rate * Time.time);
float tan3 = tan * tan * tan;
obj.transform.position = center + moveDirection * tan3;

Trigonometric Functions, Angles, And Triangles

So we’ve seen how the three basic trigonometric functions relate to the unit circle. Now we’re going to take a look at their relationships with triangles. They are called trigonometric functions, after all. Specifically, we’re going to look at right triangles (triangles with a right angle).

First, let’s get the terminologies out of the way. Here is a right triangle with an angle marked up as :

The side of the triangle between and the right angle is called the adjacent side, since it is adjacent to . The other side next to the right angle is called the opposite side, because it is across from . The remaining (also the longest) side opposed to the right angle is called the hypotenuse:

And here is how the three basic trigonometric functions relate to the lengths of the triangle sides:

length of the opposite side divided by length of the hypotenuse.
length of the adjacent side divided by length of the hypotenuse.
length of the opposite side divided by length of the adjacent side.

Or, in mathematical form:

These equations could be a bit too much to remember. Here’s a common verbal mnemonic that might help: soh-cah-toa (sine is the opposite side divided by the hypotenuse, cosine is the adjacent side divided by the hypotenuse, and tangent is the opposite side divided by the adjacent side).

I did not learn this verbal mnemonic in Taiwan (my math classes were taught in Mandarin). What I learned was a visual mnemonic that I’m quite fond of: Write the initials of sine, cosine, and tangent in cursive, along with the right triangle as shown below (please forgive my ugly handwriting).

When you write an initial, the corresponding function equals the length of the first side you write past dividing the length of the second side you write past:

length of the hypotenuse dividing length of the opposite side.
length of the hypotenuse dividing length of the adjacent side.
length of the adjacent side dividing length of the opposite side.

When describing fractions in Mandarin, instead of saying “A divided by B”, we say “B dividing A”. That’s why this mnemonic orders the divisor before the dividend in its wording. This ordering might not be intuitive to native English speakers, but if you find it useful, then great!

Now back to the equations:

Whatever the size of the right triangle, the equations above always hold true, because ratios between two sides are independent of the absolute lengths of individual sides.

If we scale the triangle so that the hypotenuse is of length 1, then we can fit it back into our unit circle figure, with being the coordinates of point on the circle:

And the equations above agree nicely with the coordinates of , :

Knowing the equations for trigonometric functions in terms of lengths of right triangle sides, for any given right triangle with an angle , if we know the length of any one side, we can derive the lengths of the other two sides using the three basic trigonometric functions.

Let denote the length of the adjacent side, the length of the opposite side, and the length of the hypotenuse:

If we know the length of the hypotenuse (), then , and :

If we know the length of the adjacent side (), then , and :

If we know the length of the opposite side (), then , and :

Simulating Cannonballs & Predicting Trajectories

Finally, it’s time for practical examples! Let’s see how we can simulate cannonballs when given an initial speed, a horizontal angle, and an elevation angle. Also, let’s find out how we can display the predicted trajectories even before firing the cannon.

But before all that, here’s a very quick recap on some basic terminologies in motion dynamics. An object’s position is where the object is physically located. An object’s velocity is the rate of change in its position (typically expressed as change of position per second). An object’s acceleration is the rate of change in its velocity (typically expressed as change of velocity per second).

The Euler Method is a quick and easy algorithm for simulating object movement: For each moving object, we store its velocity vector along with its position. For each update, or time step, we change the velocity by acceleration times delta time (the time difference between each update), and then we change the position by velocity times delta time:

velocity += acceleration * deltaTime;
position += velocity * deltaTime;

To simulate gravity at ground level and at human scale, we let the acceleration be a constant downward-pointing vector. Here’s an example of how an object would move in 2D under the influence of gravity when starting off with an initial velocity pointing up and to the right, simulated using the Euler Method:

If we simulate the entire trajectory within a single frame by performing multiple time steps, and draw a little dot once every several iterations, we can get ourselves a nice indicator of the predicted trajectory:

velocity = initialVelocity;
position = initialPosition;
for (int i = 0; i < NumIterations; ++i)
{
  velocity += acceleration * deltaTime;
  position += velocity * deltaTime;
  
  if (i % IterationsPerDot != 0)
    continue;
  
  DrawDot(position);
}

Now, let’s compute the initial velocity of a cannonball if it is fired from the cannon at an initial speed (length of the initial velocity vector), a horizontal angle , and an elevation angle (phi). Let the direction be the cannon’s forward direction and the direction be its right direction (Unity uses left-hand coordinates).

To compute the initial velocity, we need to first compute a unit vector (vector of length 1) in the same direction. Once we have that unit vector, we can simply multiply all its components by a the desired speed to obtain the initial velocity vector.

The diagram below shows the a unit vector in the direction in red, a unit vector in the direction in green, a unit vector in the direction in blue, a unit vector in the direction of initial velocity in black (labeled ), a unit vector in the horizontal direction of the initial velocity in gray (labeled ), the horizontal angle (between and ), and the elevation angle (between and ):

The goal is to find and multiply it with . We can isolate the unit vectors and angles from the diagram above into two unit circle diagrams.

One is a horizontal unit circle diagram with , , , and :

And the other one is a vertical unit circle diagram with , , , and :

If we view the first (horizontal) unit circle diagram from a different angle, we’ll get a familiar view of a flat unit circle:

We’ve done this math before. The component of in the drection of is of length , and the component in the direction of is of length . This gives us .

Now, view the second (vertical) unit circle diagram from a different angle that gives us the same familiar view of a flat unit circle:

It’s the same drill. The component of in the direction of is of length , and the component in the direction of is of length , so we can now compute :

Multiplying with gives us our initial velocity vector:

And the corresponding code is:

Vector3 ComputeInitialVelocity()
{
  float sinTheta = Mathf.Sin(HorizontalAngle);
  float cosTheta = Mathf.Cos(HorizontalAngle);
  float sinPhi = Mathf.Sin(ElevationAngle);
  float cosPhi = Mathf.Cos(ElevationAngle);

  return
    InitialSpeed
    * new Vector3
      (
        cosPhi * sinTheta, 
        sinPhi, 
        cosPhi * cosTheta
      );
}

Being able to compute the initial velocity vector from a given initial speed, horizontal angle, and elevation angle, we are now well-equipped to simulate a cannonball:

void FireCannon()
{
  velocity = ComputeInitialVelocity();
  obj.transform.position = InitialPosition;
}

void Update()
{
  float dt = Time.deltaTime;
  velocity += acceleration * dt;
  obj.transform.position += velocity * dt;
}

void DrawTrajectory()
{
  float dt = Time.fixedDeltaTime;
  Vector3 velocity = ComputeInitialVelocity();
  Vector3 position = InitialPosition;
  for (int i = 0; i < NumIterations; ++i)
  {
    velocity += acceleration * dt;
    position += velocity * dt;
    
    if (i % IterationsPerDot != 0)
      continue;
    
    DrawDot(position);
}

Placing Cannonball Targets

Now that we can fire cannonballs, let’s place some targets. If we want to place a target at a given horizontal distance away from the cannon, as well as at a given elevation angle, where exactly should we place the targets?

Below is the desired end result. Each target is at a fixed horizontal distance (on the XZ plane) away from the cannon, and is at a fixed elevation angle above ground. The targets are also equally spaced out horizontally, i.e. their horizontal angles relative to the cannon are equally spaced out.

We already know how to compute a horizontal unit vector from a horizontal angle . The horizontal unit vector is . Multiplying such horizontal vector with a given horizontal distance, denoted , gives us the horizontal offset vector of the target from the cannon: . Equally spacing out different and computing the horizontal offset vector for each value gives us the XZ coordinates of the targets (shown as red dots in the image below):

The last step is to determine the Y coordinates of the targets, i.e. how far off ground the targets should be. Recall that if we know the length of the adjacent side to an angle of a right triangle to be , then the length of the opposite side is .

Substituting with the given horizontal distance and with the elevation angle , the formula for the Y coordinate of the targets becomes .

We can finally place our targets at the desired positions:

float theta = -0.5f * AngleInterval * (NumTargets - 1);
float elevationTan = Mathf.Tan(ElevationAngle);

foreach (var target in targetArray)
{
  Vector3 horizontalVec = 
    HorizontalDistance 
    * new Vector3
      (
        Mathf.Sin(theta), 
        0.0f, 
        Mathf.Cos(theta)
      );

  theta += AngleInterval;

  Vector3 verticalVec = 
    HorizontalDistance 
    * elevationTan 
    * Vector3.up;

  target.transform.position = 
    Cannon.position 
    + horizontalVec 
    + verticalVec;
}

We haven’t talked about how to detect when a cannonball hits a target or the ground yet. Right now the cannonballs would just go through the targets:

Collision detection is beyond the scope of this tutorial, so I’ll just go over the very basics of sphere-sphere collision really quick.

To detect when a cannonball hits the target, check the distance between the centers of the two and see if it’s less than the sum of their radii. If the cannonball does collide with a target, we destroy the cannonball and the target.

Vector3 cannonballToTargetVec = 
  target.transform.position 
  - cannonball.transform.position;

float cannonballToTargetDist = cannonballToTargetVec.magnitude;
float radiusSum = cannonballRadius + targetRadius;
if (cannonballToTargetDist < radiusSum)
{
  DestroyCannonball();
  DestroyTarget();
}

Using a similar technique when drawing the predicted trajectory, we can terminate the trajectory early when it hits a target.

However, this collision detection technique is discrete, meaning that the cannonball can still go through targets if it travels fast enough. We can mitigate this problem with a continuous collision detection technique, but that is also beyond the scope of this tutorial and will be touched on in later tutorials.

Summary

Previously, we have been introduced to two basic trigonometric functions: sine and cosine. In this tutorial, we have seen a geometric interpretation of another trigonometric function: tangent. We have also learned the relationship among sine, cosine, and tangent, in the context of the unit circle, as well as right triangles.

Next, we have plotted the tangent function alongside sine and cosine; and we are now able to create smooth into and outro motion by utilizing the tangent function.

Finally, using the three basic trigonometric functions, we have learned how to predict and simulate the trajectory of a cannonball, given an initial speed and elevation angle. Plus, we have seen how to place targets, given a horizontal distance and an elevation angle.

We have learned the basics of the three fundamental trigonometric functions that are essential in solving daily gamedev problems. In later tutorials, I will go over more useful mathematical tools that are built on top of these trigonometric functions, as well as some of their practical applications.

If you enjoyed this tutorial and would like to see more, please consider supporting me on Patreon. By doing so, you can also get updates on future tutorials. Thanks!

The post Game Math: Trigonometry Basics – Tangent, Triangles, And Cannonballs first appeared on Ming-Lun "Allen" Chou | 周明倫.

Game Math: Trigonometry Basics – Sine & Cosine

Allen Chou — Mon, 26 Aug 2019 04:08:55 +0000

Source files and future updates are available on Patreon.
You can follow me on Twitter.

This post is part of my Game Math Series.

本文之中文翻譯在此

Overview

Trigonometry is a very essential building block to a huge portion of game math. That’s why I’ve chosen this topic for the first tutorial of my new Gamedev Tutorials series. Having a solid understanding of basic of trigonometry can go a long way for game development. It is used extensively in game problem solving.

In this tutorial you’ll learn:

A geometric interpretation of two basic trigonometric functions: sine & cosine.
The comparison of two different angle units: degrees & radians.
Some basic properties of sine & cosine.
How to move and arrange things in a circular fashion:

How to move things in a spiral fashion:

How to create simple harmonic motion:

How to create damped spring motion:

How to create pendulum motion:

How to generate hovering motion:

Geometric Interpretation of Sine & Cosine

Let’s look at the unit circle, a circle with a radius of 1 centered at the origin.

Now pick a point on the circle. The line segment between this point and the origin forms an angle (theta) between it and the X axis (positive X direction).

What’s shown here is actually a way to geometrically express the 2 basic trigonometric functions: (sine of theta) and (cosine of theta). The coordinates of this point are exactly .

So, to recap, the two trigonometry function and are, respectively, the Y and X coordinates of a point on the unit circle, where is the angle from the X axis (positive X direction) to the line segment between the point and the origin.

Since and are functions, the proper notation should include parenthesis around the input: and , but many people and literature just ignore the parenthesis and write them as and .

and are functions that take one single input (an angle) and output a single value between -1 and 1. If you think about it, this output range makes sense: the X and Y coordinates of a point on the of the unit circle can never go outside of the range.

If the angle increases at a constant rate, we can plot the value of the point’s X and Y coordinates individually over time.

If we compare the plots side-by-side in the form of angle-vs-value, we can see they are the same periodic curve in a wave-like shape, but offset by a fourth from each other.

The period of these functions is , so gives the same value as . This makes sense, because rotating an extra past would bring us back to the same angle.

Degrees v.s. Radians

The angle passed into the trigonometric functions can be in two different units: degrees and radians. Most people are familiar with degrees and its upper-little-circle notation. For instance, the right angle (90 degrees) is written as . Do beware that is not the same as . If the degree notation is not present, the angle’s unit is actually regarded as radians.

is equivalent to (pi) radians, where is the famous mathematical constant, “the ratio of a circle’s circumference to its diameter”, approximately equal to 3.14. Hence, an angle of 1 radian is approximately , almost . As a sanity check, entering on an engineering calculator (with angle units set to radians) will give us approximately , which is indeed close to .

Here are some common degree-to-radian mappings:

In Unity, the and functions are called via Mathf.Sin and Mathf.Cos, respectively. Beware that these functions take input in radians, so if you want to compute , don’t write:

// this is actually cosine of 45 radians!
float cos45Deg = Mathf.Cos(45.0f);

45 radians is about . A full revolution of is equivalent to no rotation at all. divided by gives us a remainder of , which is the equivalent angle of and is different from .

To compute , write this instead:

// covert to radians
float cos45Deg = Mathf.Cos(45.0f * Mathf.PI / 180.0f);

Or use constants that help convert between degrees and radians.

float cos45Deg = Mathf.Cos(45.0f * Mathf.Deg2Rad);

In tools like the Unity editor, expressing angles in degrees is more user friendly, because most people can immediately picture what a angle looks like. However, in the context of math and programming, many people, myself included, prefer sticking with radians.

One useful thing about radians is that it trivializes calculating arc length from a given radius and angle. Let’s say we want to calculate the length of an arc of , or radians, from a circle of radius 2.

If computed using degrees, first the whole circumference is calculated using the formula , and then it’s multiplied by the ratio of out of :

When using radians, the arc length formula is simply radius times angle in radians:

The circle’s circumference formula agrees nicely with the arc length formula in radians. Since one full circle is basically an arc with an angle of radians, the length of such arc is , exactly the same as the circle’s circumference formula.

Basic Properties of Sine & Cosine

Now let’s look at some basic properties of sine & cosine that can come in handy in future mathematical derivations.

Since are coordinates of a point on the unit circle, the point’s distance from the origin is always 1, regardless of the angle . The Pythagorean theorem states that the distance of the point from the origin is . From there we can get this identity (equation that is always true):

The squares of and are written as and , respectively. People write them that way probably because they are too lazy to write and .

Recall the side-by-side comparison of the sine and cosine plots.

You can see that the cosine curve basically is the sine curve shifted to the left by , or radians. This means we can get these identities that convert between the two:

Moving in Circles & Spirals

Now that we’ve seen that are 2D coordinates of a point on the unit circle, we can start playing with some basic circular motion in Unity.

The code below moves an object around a circle at a constant rate:

obj.transform.position = 
  new Vector3
  (
    Radius * Mathf.Cos(Rate * Time.time), 
    Radius * Mathf.Sin(Rate * Time.time), 
    0.0f
  );

The code below moves 12 objects around a circle at a constant rate, and the objects are equally spaced out around the circle:

float baseAngle = Rate * Time.time + angleOffset;
for (int i = 0; i < 12; ++i)
{
  float angleOffset = 2.0f * Mathf.PI * i / 12.0f;
  aObj[i].transform.position = 
    new Vector3
    (
      Radius * Mathf.Cos(baseAngle + angleOffset), 
      Radius * Mathf.Sin(baseAngle + angleOffset), 
      0.0f
    );
}

Combining circular motion with movement in the Z direction, we can create a spiral motion in 3D:

obj.transform.position = 
  new Vector3
  (
    Radius * Mathf.Cos(Rate * Time.time),
    Radius * Mathf.Sin(Rate * Time.time),
    ZSpeed * Time.time
  );

Simple Harmonic Motion (S.H.M.)

We’ve seen this plot of cosine versus angle:

What if we plug cosine into the X coordinate of an object?

float x = Mathf.Cos(Rate * Time.time);
obj.transform.position = Vector3(x, 0.0f, 0.0f);

This is what we get:

This kind of oscillating motion that matches a sine-shaped curve, a.k.a. sinusoid, is known as simple harmonic motion, or S.H.M.

Since starts at zero, the object’s X coordinate starts at . If we use , the X coordinate would start at .

The input angle passed in to the sine and cosine functions are called the phase. Typically, if the phase passed in is a constant multiple of time, many people write it as , where (omega) is called the angular frequency (in radians per second), and is the time. For example, would produce a simple harmonic motion that oscillates one full cycle every second.

What if we scale this motion by an exponentially decreasing factor?

float s = Mathf.Pow(0.5f, Decay * Time.time);
float x = Mathf.Cos(Rate * Time.time);
obj.transform.position = Vector3(s * x, 0.0f, 0.0f);

Now the object moves in a damped spring motion:

Pendulum Motion

Instead of plugging a sinusoid into an object’s X coordinate, what if we plug it into the angle for the circular motion example above?

float baseAngle = 1.5f * Mathf.PI; // 270 degrees
float halfAngleRange = 0.25f * mathf.PI; // 45 degrees
float c = Mathf.Cos(Rate * Time.time);
float angle = halfAngleRange * c + baseAngle;
obj.transform.position = 
  new Vector3
  (
    Radius * Mathf.Cos(angle), 
    Radius * Mathf.Sin(angle), 
    0.0f
  );

The object now moves in a pendulum motion:

We can treat this as the circular motion’s angle being in a simple harmonic motion.

Hovering Motion

As a bonus example, here is UFO Bunny, a character from Boing Kit, my bouncy VFX extension for Unity.

We can apply staggered simple harmonic motion to her X, Y, and Z coordinates separately.

Vector3 hover = 
  new Vector3
  (
    RadiusX * Mathf.Sin(RateX * Time.time + OffsetX), 
    RadiusY * Mathf.Sin(RateY * Time.time + OffsetY), 
    RadiusZ * Mathf.Sin(RateZ * Time.time + OffsetZ)
  );

obj.transform.position = basePosition + hover;

And this creates a hovering motion.

And the hover offset can be used to compute a tilt rotation. This is beyond the scope of this tutorial, so I’ll just leave the code and results here.

obj.transform.rotation = 
  baseRotation 
  * Quaternion.FromToRotation
    (
      Vector3.up, 
      -hover + 3.0f * Vector3.up
    );

Summary

That’s it!

We have seen how and can be geometrically defined as coordinates of a point on the unit circle.

Also, we have seen the difference between the two angle units: degrees and radians.

Finally, we now know how to moves things in circles and spirals, as well as oscillating things in simple harmonic motion, damped spring motion, pendulum motion, and hove motion.

I hope this tutorial has helped you get a better understanding of the 2 basic trigonometric functions: sine & cosine.

In the next tutorial, I will introduce one additional basic trigonometric function: tangent, as well as talk about more applications of all these 3 functions.

Until then!

If you enjoyed this tutorial and would like to see more, please consider supporting me on Patreon. By doing so, you can also get updates on future tutorials. Thanks!

The post Game Math: Trigonometry Basics – Sine & Cosine first appeared on Ming-Lun "Allen" Chou | 周明倫.

Readable & Debuggable Multi-Condition Game Code

Allen Chou — Sat, 21 Jul 2018 04:23:36 +0000

This post is part of my Game Programming Series.

Over time, I have adopted several coding patterns for writing readable and debuggable multi-condition game code, which I systematically follow.

By multi-condition code, I’m talking about code where certain final logic is executed, after multiple condition tests have successfully passed; also, if a test fails, further tests will be skipped for efficiency, effectively achieving short-circuit evaluation.

NOTE: I only mean to use this post to share some coding patterns I find useful, and I do NOT intend to suggest nor claim that they are “the only correct ways” to write code. Also, please note that it’s not all black and white. I don’t advocate unconditionally adhering to these patterns; I sometimes mix them with opposite patterns if it makes more sense and makes code more readable.

Shortcuts:
Early Outs
Debuggable Conditions
Debug Draw Locality
Forcing All Debug Draws

Early Outs

There are two ends of spectrum regarding this subject, early outs versus single point of return. Both camps have pretty valid arguments, but I lean more towards the early-out camp. The early-out style is the foundation of all the patterns presented in this post.

As an example, consider the scenario where we need to test if a character is facing the right direction, if the weapon is ready, and if the path is clear, before finally executing an attack.

This is one way to write it:

FacingData facingData = PrepareFacingData();
if (TestFacing(facingData))
{
  WeaponData weaponData = PrepareWeaponData();
  if (TestWeaponReady(weaponData))
  {
    PathData pathData = PareparePathData();
    if (TestPathClear(pathData))
    {
      Attack();
    }
  }
}

The code block above is written in the so-called single-point-of-return style. The logic flow is straightforward and always ends at the bottom of the code block. If the code block is wrapped inside a function, the function will have one single point of return, i.e. at the bottom of the function:

void TryAttack()
{
  FacingData facingData = PrepareFacingData();
  if (TestFacing(facingData))
  {
    WeaponData weaponData = PrepareWeaponData();
    if (TestWeaponReady(weaponData))
    {
      PathData pathData = PareparePathData();
      if (TestPathClear(pathData))
      {
        Attack();
      }
    }
  }
  // return here
}

The same logic in early-out style would look something like this, returning right where the first test fails:

void TryAttack()
{
  FacingData facingData = PrepareFacingData();
  if (!TestFacing(facingData))
    return;

  WeaponData weaponData = PrepareWeaponData();
  if (!TestWeaponReady(weaponData))
    return;

  PathData pathData = PareparePathData();
  if (!TestPathClear(pathData))
    return;

  Attack();
}

I like this style better, because the intent to get out of the function right after the first failing test is very clear. Also, as the number of tests get large, it avoids excessive indentation, which many jokingly call “indent hadouken.”

PHP Streetfighter #php pic.twitter.com/5dQ5H1UrB2

Paul Dragoonis (@dr4goonis) June 11, 2014

Early outs don’t just apply to functions. They apply to loops as well. For example, this is how I would iterate through characters and collect those who can attack:

for (Characer &c : characters)
{
  FacingData facingData = ParepareFacingData(c);
  if (!TestFacing(facingData))
    continue;

  WeaponData weaponData = PrepareWeaponData(c);
  if (!TestWeaponReady(weaponData))
    continue;

  PathData pathData = PareparePathData(c);
  if (!TestPathClear(pathData))
    continue;

  charactersWhoCanAttack.Add(c);
}

One special case where early out is not quite possible without refactoring, is that if there’s more code that must always be executed after the condition tests.

In the single-point-of-return style, it may look like this:

void TryAttack()
{
  FacingData facingData = PrepareFacingData();
  if (TestFacing(facingData))
  {
    WeaponData weaponData = PrepareWeaponData();
    if (TestWeaponReady(weaponData))
    {
      PathData pathData = PareparePathData();
      if (TestPathClear(pathData))
      {
        Attack();
      }
    }
  }

  // always execute
  PostAttackTry();
}

But using the early-out style, we’d have a problem:

void TryAttack()
{
  FacingData facingData = PrepareFacingData();
  if (!TestFacing(facingData))
    return;

  WeaponData weaponData = PrepareWeaponData();
  if (!TestWeaponReady(weaponData))
    return;

  PathData pathData = PareparePathData();
  if (!TestPathClear(pathData))
    return;

  Attack();

  // Uh, oh. Not always executed!
  PostAttackTry();
}

If you’re comfortable with using forward-only gotos to jump to the PostAttackTry call and your team’s coding standards allow it, then you’re set. Otherwise, we need to keep looking for solutions.

What about calling PostAttackTry wherever the function returns?

void TryAttack()
{
  FacingData facingData = PrepareFacingData();
  if (!TestFacing(facingData))
  {
    PostAttackTry();
    return;
  }

  WeaponData weaponData = PrepareWeaponData();
  if (!TestWeaponReady(weaponData))
  {
    PostAttackTry();
    return;
  }

  PathData pathData = PareparePathData();
  if (!TestPathClear(pathData))
  {
    PostAttackTry();
    return;
  }

  Attack();

  PostAttackTry();
}

This is also not good. The moment someone adds a new return while forgetting to also add a call to PostAttackTry, the logic breaks.

In this case, if the logic is trivial, I’d be okay with just using the single-point-of-return style. Otherwise, I would refactor the tests into a separate function, while maintaining the early-out style:

bool CanAttack()
{
  FacingData facingData = PrepareFacingData();
  if (!TestFacing(facingData))
    return false;

  WeaponData weaponData = PrepareWeaponData();
  if (!TestWeaponReady(weaponData))
    return false;

  PathData pathData = PareparePathData();
  if (!TestPathClear(pathData))
    return false;

  return true;
}

void TryAttack()
{
  if (CanAttack())
    Attack();

  PostAttackTry();
}

[EDIT]
I’ve received feedback proposing the use of destructors of some helper class to invoke the final logic, a la scope-based resource management (RAII).

For me, the acceptable instances that rely on destructors like this are scoped-based resource management, profiler instrumentation, and whatever logic that comes in the form of a tightly coupled “entry” and “exit” logic pairs, but not this.

In my acceptable cases, an exit logic is ensured to always accompany its corresponding entry logic, taking burden off programmers by preventing them from accidentally exiting a scope without executing the exit logic.

I think relying on destructors to execute arbitrary logic upon scope exit that is not coupled with an entry logic induces unnecessary complexity and risk of omittance. When reading the code, instead of making a mental note of “something is executed here, and the accompanying exiting logic will be executed upon scope exit”, the readers now have to remember that “nothing is done yet, and only when the scope is exited will something happen.” To me, the latter is a mental burden that is more likely to be omitted, because it’s not tied to any concrete logic execution right at the code location where the helper struct is constructed.
[/EDIT]

Debuggable Conditions

Continuing using the character attack example from above, let’s consider the scenario where each test function now returns a results struct that contains a success flag indicating whether the test has passed, as well as extra info gathered from the test that is useful for debugging purposes.

This is what the code might look like:

void TryAttack()
{
  FacingData facingData = PrepareFacingData();
  if (!TestFacing(facingData).IsFacingValid())
    return;

  WeaponData weaponData = PrepareWeaponData();
  if (!TestWeaponReady(weaponData).IsWeaponReady())
    return;

  PathData pathData = PareparePathData();
  if (!TestPathClear(pathData).IsPathClear())
    return;

  Attack();
}

It looks good and all, but there’s one specific issue that technically doesn’t lie in the code itself, but it affects the programmer’s experience when debugging this piece of code.

Embedding the test function’s return value within if conditions like this, means that if we want to set a break point and peek inside the extra info in the results structs, we’d have to step into the individual tests, step out of the tests, and then look at the returned value.

Visual Studio supports this feature inside the autos window:

I find it slightly annoying to have to step in and then step out of test functions, just to inspect their results results structs. So, I normally assign the return values to local variables, and then perform if checks on those variables instead.

This way, I can just step over the test function calls and inspect the results structs, without having to step in and out of the test functions:

void TryAttack()
{
  FacingData facingData = PrepareFacingData();
  FacingResults facingResults = TestFacing(facingData);
  if (!facingResults.IsFacingValid())
    return;

  WeaponData weaponData = PrepareWeaponData();
  WeaponResults weaponResults = TestWeaponReady(weaponData);
  if (!weaponResults.IsWeaponReady())
    return;

  PathData pathData = PareparePathData();
  PathResults pathResults = TestPathClear(pathData);
  if (!pathResults.IsPathClear())
    return;

  Attack();
}

Even if we’ve stepped past some tests, extra info regarding the tests is still available in the locals window, which is a convenience I now cannot live without:

As a side note, if the test is just a trivial expression, I think embedding it in the if condition is totally fine, and it can actually make the code cleaner.

Debug Draw Locality

NOTE: For simplicity’s sake, code mechanisms to effectively strip debug draws in release build is omitted in this post (e.g. #ifdefs, macros, flags, etc.).

Sometimes we need to debug draw based on test results, to show why a test succeeded or failed. That’s what the results structs returned from test functions are for.

I like to keep debug draw code close to the related condition tests, if not inside the test function themselves. In my opinion, this makes the code cleaner and easier to read in independent chunks.

It’s perfectly fine if leaving certain debug draw logic inside the test functions themselves makes more sense. However, there are cases where drastically different debug draws are desired at different call sites of the test functions.

My experience tells me that it’s not possible to anticipate how others will use test results for their own debug draws, so I usually put trivial or common debug draw logic inside the test functions, and leave use-case specific debug draw logic to the client code calling the test functions. In the attack example, the TryAttack function body is considered client code that uses the test functions.

I generally follow this pattern:

void TryAttack()
{
  // facing
  FacingData facingData = PrepareFacingData();
  FacingResults facingResults = TestFacing(facingData);
  if (facingResults.IsFacingValid())
  {
    DebugDrawFacingSuccess(facingResults.GetSuccessInfo());
  }
  else
  {
    DebugDrawFacingFailure(facingResults.GetFailureInfo());
    return;
  }

  // weapon
  WeaponData weaponData = PrepareWeaponData();
  WeaponResults weaponResults = TestWeaponReady(weaponData);
  if (weaponResults.IsWeaponReady())
  {
    DebugDrawWeaponSuccess(weaponResults.GetSuccessInfo());
  }
  else
  {
    DebugDrawWeaponFailure(weaponResults.GetFailureInfo());
    return;
  }

  // path
  PathData pathData = PareparePathData();
  PathResults pathResults = TestPathClear(pathData);
  if (pathResults.IsPathClear())
  {
    DebugDrawPathSuccess(pathResults.GetSuccessInfo());
  }
  else
  {
    DebugDrawPathFailure(pathResults.GetFailureInfo());
    return;
  }

  // final logic
  Attack();
}

This pattern is, again, in the early-out style.

If we use the single-point-of-return style, the code above can turn into this:

void TryAttack()
{
  FacingData facingData = PrepareFacingData();
  FacingResults facingResults = TestFacing(facingData);
  if (facingResults.IsFacingValid())
  {
    DebugDrawFacingSuccess(facingResults.GetSuccessInfo());

    WeaponData weaponData = PrepareWeaponData();
    WeaponResults weaponResults = TestWeaponReady(weaponData);
    if (weaponResults.IsWeaponReady())
    {
      DebugDrawWeaponSuccess(weaponResults.GetSuccessInfo());

      PathData pathData = PareparePathData();
      PathResults pathResults = TestPathClear(pathData);
      if (pathResults.IsPathClear())
      {
        DebugDrawPathSuccess(pathResults.GetSuccessInfo());
      }
      else
      {
        DebugDrawPathFailure(pathResults.GetFailureInfo());
      }
    }
    else
    {
      DebugDrawWeaponFailure(weaponResults.GetFailureInfo());
    }
  }
  else
  {
    DebugDrawFacingFailure(facingResults.GetFailureInfo());
  }

  Attack();
}

The call to DebugDrawFacingFailure is all the way down inside the bottom else block. This is bad in terms of code locality. When I see the call to DebugDrawFacingFailure at the end, I’d have to trace all the way up to find its corresponding condition test.

There are single-point-of-return alternatives that can improve debug draw locality, but it’s still always going to be a challenge to make clean cuts to separate code into chunks that fully contain reference to individual tests. Later test chunks will always need to reference earlier test results.

Forcing All Debug Draws

Sometimes it’s preferable to force debug draws for all test results, even when early tests fail. In this case, we don’t care about the effect of short-circuit evaluation any more.

This is the pattern I follow that adds a flag to force all debug draws, which in turn could be toggled by a debug option:

void TryAttack(bool forceAllDebugDraws)
{
  bool anyTestFailed = false;

  // facing
  const FacingResults facingResults = TestFacing();
  if (facingResults.IsFacingValid())
  {
    DebugDrawFacingSuccess(facingResults.GetSuccessInfo());
  }
  else
  {
    DebugDrawFacingFailure(facingResults.GetFailureInfo());
    anyTestFailed = true;
    if (!forceAllDebugDraws)
      return;
  }

  // weapon
  const WeaponResults weaponResults = TestWeaponReady();
  if (weaponResults.IsWeaponReady())
  {
    DebugDrawWeaponSuccess(weaponResults.GetSuccessInfo());
  }
  else
  {
    DebugDrawWeaponFailure(weaponResults.GetFailureInfo());
    anyTestFailed = true;
    if (!forceAllDebugDraws)
      return;
  }

  // path
  const PathResults pathResults = TestPathClear();
  if (pathResults.IsPathClear())
  {
    DebugDrawPathSuccess(pathResults.GetSuccessInfo());
  }
  else
  {
    DebugDrawPathFailure(pathResults.GetFailureInfo());
    anyTestFailed = true;
    if (!forceAllDebugDraws)
      return;
  }

  // we'd only get here if total debug draw is forced
  // don't perform attack if any test has failed
  if (anyTestFailed)
    return;

  // final logic
  Attack();
}

If the flag to force all debug draws is set to true, all condition tests as well as debug draws will be executed. But the final Attack function call still wouldn’t be reached, because it’s guarded by a flag keeping track of whether any test has failed.

You might have already foreseen that if the number of tests grow large, we can end up having a lot of duplicate code and logic structure, respectively the use of anyTestFailed & forceAllDebugDraws inside the else blocks, and if statements branching into calling success & failure debug draws.

If you’re willing to make a sacrifice to prepare a single master data struct at the start, which is to be passed into all test functions declared with the same signature, plus a master results struct that holds all test info for debug draws, here’s one alternative pattern for your consideration:

// these are C++ function pointers
// if you're using C#, think of them as delegates
bool (* TestFunc) (const Data &data, Results &results);
bool (* DebugDrawFunc) (const Results &results);

struct TestSpec
{
  TestFunc m_func;
  DebugDrawFunc m_debugDrawSuccess;
  DebugDrawFunc m_debugDrawFailure;
};

void TryAttack(bool forceAllDebugDraws)
{
  bool anyTestFailed = false;

  // define sets of test function and debug draw functions
  TestFuncSpec testFuncs[] =
  {
    { // facing
      TestFacing,
      DebugDrawFacingSuccess,
      DebugDrawFacingFailure
    },
    { // weapon
      TestWeaponReady,
      DebugDrawWeaponSuccess,
      DebugDrawWeaponFailure
    },
    { // path
      TestPathClear,
      DebugDrawPathSuccess,
      DebugDrawPathFailure
    }
  }

  // iterate through each test
  Data masterData = PrepareMasterData();
  Results masterResults;
  bool anyTestFailed = false;
  for (TestFuncSpec &spec : testFuncs)
  {
    bool success = spec.m_func(masterData, masterResults);
    if (success)
    {
      spec.m_debugDrawSuccess(masterResults);
    }
    else
    {
      spec.m_debugrRawFailure(masterResults);
      anyTestFailed = true;
      if (!forceAllDebugDraws)
        return;
    }
  }

  if (anyTestFailed)
    return;

  Attack();
}

When a new test function and its success & failure debug draw functions are defined, simply add the function set to the testFuncs array. There is only one shared code structure (the range-based for loop) that runs the tests, selects the success or failure debug draw functions to call, and optionally performs early outs.

Finally, if the length of the TryAttack function grows to a point where the purpose of the function is not trivially clear any more. Recall a refactored variation above where all the condition tests are extracted into a separate CanAttack function:

void TryAttack()
{
  if (CanAttack())
    Attack();

  PostAttackTry();
}

This seems like a good change no matter how the conditional tests are done, as it makes the intention of TryAttack crystal clear to the reader. I’d do this if readability is compromised due to function length.

Summary

That’s it! I’ve shared the coding patterns I think help make code readable and debuggable.

Using the early-out style as foundation, I’ve shown how to write debuggable conditions, achieve debug draw locality, and optionally force all debug draws.

I don’t consider these patterns exciting nor groundbreaking, but I find them very useful, and I hope you do, too.

The post Readable & Debuggable Multi-Condition Game Code first appeared on Ming-Lun "Allen" Chou | 周明倫.

Game Math: Deriving the Slerp Formula

Allen Chou — Fri, 18 May 2018 06:30:40 +0000

This post is part of my Game Math Series.

It occurred to me that the entire time I’ve been working with quaternions, I have never read or learned about the derivation of the formula for slerp, spherical linear interpolation. I just learned the final formula and have been using it.

Upon a preliminary search I couldn’t seem to immediately find a straightforward derivation, either (at least not one that fits in the context of game development). So I thought it might be a fun exercise to derive it myself.

As it turns out, it is indeed fun and could probably serve as an interesting trigonometry & vector quiz question!

A quick recap: slerp is an operation that interpolates between two vectors along the shortest arc (in any dimension higher than 1D). It takes as input the two vectors to interpolate between plus an interpolation parameter:

where is the angle between the two vectors:

If the interpolation parameter changes at a constant rate, the angular velocity of the slerp result is also constant. If is set to , it means the slerp result is “the 25% waypoint on the arc from to : the angle between and the slerp result is , and the angle between and the slerp result is .

In the context of game development, slerp is typically used to interpolate between orientations represented by quaternions, which can be expressed as 4D vectors. In this case the shortest arc slerp interpolates across lies on a 4D hypersphere.

As mentioned before, this formula can be used on any vectors in any dimension higher than 1D. So it can also be used to interpolate between two 3D vectors along a sphere, or between two 2D vectors along a circle.

In the context of game development, we almost exclusively work with unit quaternions. So in my derivation, I make the assumption that the vectors we are working with are all unit vectors. The flow of the derivation should be pretty much the same even if the vectors are not unit vectors.

Without further ado, here’s the derivation.

The Derivation

Let be the results of slerp:

And let be the angle between and .

Knowing that the angle between and is , and the angle between and is , we can come up with this figure:

Here’s the strategy. We build a pair of orthogonal axes and from and . Then, we use the parametric circle formula to find :

Since is already a unit vector that convenient lies on the horizontal axis in the figure, let’s just pick . So then can be found by taking away the component in that is parallel to and normalizing the remainder:

Now plug and back into the parametric circle formula:

And voila! We have our slerp formula.

Edit: Eric Lengyel has pointed out there’s another way to derive the slerp formula using similar triangles, presented in his Mathematics for 3D Game Programming and Computer Graphics, 3rd ed., Section 4.6.3.

The post Game Math: Deriving the Slerp Formula first appeared on Ming-Lun "Allen" Chou | 周明倫.

Game Math: Swing-Twist Interpolation (…Sterp?)

Allen Chou — Sun, 13 May 2018 16:25:02 +0000

This post is part of my Game Math Series.

Source files are on GitHub.
Shortcut to sterp implementation.
Shortcut to code used to generate animations in this post.

An Alternative to Slerp

Slerp, spherical linear interpolation, is an operation that interpolates from one orientation to another, using a rotational axis paired with the smallest angle possible.

Quick note: Jonathan Blow explains here how you should avoid using slerp, if normalized quaternion linear interpolation (nlerp) suffices. Long store short, nlerp is faster but does not maintain constant angular velocity, while slerp is slower but maintains constant angular velocity; use nlerp if you’re interpolating across small angles or you don’t care about constant angular velocity; use slerp if you’re interpolating across large angles and you care about constant angular velocity. But for the sake of using a more commonly known and used building block, the remaining post will only mention slerp. Replacing all following occurrences of slerp with nlerp would not change the validity of this post.

In general, slerp is considered superior over interpolating individual components of Euler angles, as the latter method usually yields orientational sways.

But, sometimes slerp might not be ideal. Look at the image below showing two different orientations of a rod. On the left is one orientation, and on the right is the resulting orientation of rotating around the axis shown as a cyan arrow, where the pivot is at one end of the rod.

If we slerp between the two orientations, this is what we get:

Mathematically, slerp takes the “shortest rotational path”. The quaternion representing the rod’s orientation travels along the shortest arc on a 4D hypersphere. But, given the rod’s elongated appearance, the rod’s moving end seems to be deviating from the shortest arc on a 3D sphere.

My intended effect here is for the rod’s moving end to travel along the shortest arc in 3D, like this:

The difference is more obvious if we compare them side-by-side:

This is where swing-twist decomposition comes in.

Swing-Twist Decomposition

Swing-Twist decomposition is an operation that splits a rotation into two concatenated rotations, swing and twist. Given a twist axis, we would like to separate out the portion of a rotation that contributes to the twist around this axis, and what’s left behind is the remaining swing portion.

There are multiple ways to derive the formulas, but this particular one by Michaele Norel seems to be the most elegant and efficient, and it’s the only one I’ve come across that does not involve any use of trigonometry functions. I will first show the formulas now and then paraphrase his proof later:

Given a rotation represented by a quaternion and a twist axis , combine the scalar part from the projection of onto to form a new quaternion:

We want to decompose into a swing component and a twist component. Let the denote the swing component, so we can write . The swing component is then calculated by multiplying with the inverse (conjugate) of :

Beware that and are not yet normalized at this point. It’s a good idea to normalize them before use, as unit quaternions are just cuter.

Below is my code implementation of swing-twist decomposition. Note that it also takes care of the singularity that occurs when the rotation to be decomposed represents a 180-degree rotation.

public static void DecomposeSwingTwist
(
Quaternion q,
Vector3 twistAxis,
out Quaternion swing,
out Quaternion twist
)
{
Vector3 r = new Vector3(q.x, q.y, q.z);

// singularity: rotation by 180 degree
if (r.sqrMagnitude < MathUtil.Epsilon)
{
Vector3 rotatedTwistAxis = q * twistAxis;
Vector3 swingAxis =
Vector3.Cross(twistAxis, rotatedTwistAxis);

if (swingAxis.sqrMagnitude > MathUtil.Epsilon)
{
float swingAngle =
Vector3.Angle(twistAxis, rotatedTwistAxis);
swing = Quaternion.AngleAxis(swingAngle, swingAxis);
}
else
{
// more singularity:
// rotation axis parallel to twist axis
swing = Quaternion.identity; // no swing
}

// always twist 180 degree on singularity
twist = Quaternion.AngleAxis(180.0f, twistAxis);
return;
}

// meat of swing-twist decomposition
Vector3 p = Vector3.Project(r, twistAxis);
twist = new Quaternion(p.x, p.y, p.z, q.w);
twist = Normalize(twist);
swing = q * Quaternion.Inverse(twist);
}

Now that we have the means to decompose a rotation into swing and twist components, we need a way to use them to interpolate the rod’s orientation, replacing slerp.

Swing-Twist Interpolation

Replacing slerp with the swing and twist components is actually pretty straightforward. Let the and denote the quaternions representing the rod’s two orientations we are interpolating between. Given the interpolation parameter , we use it to find “fractions” of swing and twist components and combine them together. Such fractiona can be obtained by performing slerp from the identity quaternion, , to the individual components.

So we replace:

with:

From the rod example, we choose the twist axis to align with the rod’s longest side. Let’s look at the effect of the individual components and as varies over time below, swing on left and twist on right:

And as we concatenate these two components together, we get a swing-twist interpolation that rotates the rod such that its moving end travels in the shortest arc in 3D. Again, here is a side-by-side comparison of slerp (left) and swing-twist interpolation (right):

I decided to name my swing-twist interpolation function sterp. I think it’s cool because it sounds like it belongs to the function family of lerp and slerp. Here’s to hoping that this name catches on.

And here’s my code implementation:

public static Quaternion Sterp
(
Quaternion a,
Quaternion b,
Vector3 twistAxis,
float t
)
{
Quaternion deltaRotation = b * Quaternion.Inverse(a);

Quaternion swingFull;
Quaternion twistFull;
QuaternionUtil.DecomposeSwingTwist
(
deltaRotation,
twistAxis,
out swingFull,
out twistFull
);

Quaternion swing =
Quaternion.Slerp(Quaternion.identity, swingFull, t);
Quaternion twist =
Quaternion.Slerp(Quaternion.identity, twistFull, t);

return twist * swing;
}

Proof

Lastly, let’s look at the proof for the swing-twist decomposition formulas. All that needs to be proven is that the swing component does not contribute to any rotation around the twist axis, i.e. the rotational axis of is orthogonal to the twist axis.

Let denote the parallel component of to , which can be obtained by projecting onto :

Let denote the orthogonal component of to :

So the scalar-vector form of becomes:

Using the quaternion multiplication formula, here is the scalar-vector form of the swing quaternion:

Take notice of the vector part of the result:

This is a vector parallel to the rotational axis of . Both and are orthogonal to the twist axis , so we have shown that the rotational axis of is orthogonal to the twist axis. Hence, we have proven that the formulas for and are valid for swing-twist decomposition.

Conclusion

That’s all.

Given a twist axis, I have shown how to decompose a rotation into a swing component and a twist component.

Such decomposition can be used for swing-twist interpolation, an alternative to slerp that interpolates between two orientations, which can be useful if you’d like some point on a rotating object to travel along the shortest arc.

I like to call such interpolation sterp.

Sterp is merely an alternative to slerp, not a replacement. Also, slerp is definitely more efficient than sterp. Most of the time slerp should work just fine, but if you find unwanted orientational sway on an object’s moving end, you might want to give sterp a try.

Edit: Application in 2D

An application of swing-twist decomposition in 2D just came to mind.

If the twist axis is chosen to be orthogonal to the screen, then we can utilize swing-twist decomposition to use the orientation of objects in 3D to drive the rotation of 2D elements in screen space or some other data. The twist component represents exactly the portion of 3D rotation projected onto screen space.

However, in terms of performance, we might be better off just projecting a 3D object’s local axis onto screen space and find the angle between it and a screen space axis. But then again, the swing-twist decomposition approach doesn’t have the singularity the projection approach has when the chosen local axis becomes orthogonal to the screen.

The post Game Math: Swing-Twist Interpolation (…Sterp?) first appeared on Ming-Lun "Allen" Chou | 周明倫.

Unity Debug Draw Utility – Now with Shaded Styles

Allen Chou — Thu, 26 Oct 2017 07:11:55 +0000

View post on imgur.com

This post is part of my Game Programming Series.

Complete source code for the debug draw utility and Unity scene for generating the demo animation above can be found on GitHub. Here is a shortcut to the debug draw utility class. And here is a shortcut to the shaders.

Debug Draw Upgraded

A couple weeks ago, I documented how I implemented a wireframe Unity debug draw utility using cached mesh pools and vertex shaders.

Recently, I have upgraded the utility to now support various shaded styles, including solid color, flat-shaded, and smooth-shaded. This post is a documentation of my development process and how I solved some of the challenges on the way.

Extending The Mesh Factory

For each mesh rendered in wireframe style, the original mesh factory only needed to generate an array of unique vertices, along with an index array containing the vertex indices in either lines or line strip topology.

To generate a mesh to be rendered in solid color style, I reused the same unique vertex arrays, but the index arrays hadto be changed to contain vertex indices in triangle topology, three indices per triangle.

Once the generation of meshes for solid color style was done, I decided counter-intuitively to first implement the “fancier” smooth-shaded style before the flat-shaded style, because the former was actually an easier incremental change from the solid color style. Taking spheres for example, the vertex array actually still didn’t need to be changed; I just had to create an array of normals that is the exact copy of the vertices. Recall from the previous post that in order to reduce numbers of cached meshes, I offloaded scaling to the vertex shaders and just generated meshes that are unit primitives. The normal of a vertex of a smooth-shaded unit sphere is just conveniently identical to the vertex positional vector.

Figuring out the index arrays for other smooth-shaded primitive meshes wasn’t as straightforward as spheres, but it wasn’t too hard either. I still didn’t need to change most of the vertex arrays and just had to figure out the proper accompanying normal array and index array. Cones were a notable exception, because even with smooth-shaded style, they still have some normal discontinuity along the base edges, which required duplicates of the base edge vertices with different normals.

Finally moving onto the flat-shaded style, most primitives required me to modify the generation of vertex arrays, normal arrays, and index arrays. Arrays of unique vertices no longer worked, because a vertex shared by multiple faces (triangles, quads, circles, etc.) would have a different normal on each face. For each face, a new set of vertices had to be put into the vertex array. Different primitives required slightly different techniques to generate the vertices for each face. Taking spheres for example again, for each longitudinal strip, two triangles connecting to the poles plus two triangles per quad along the strip were needed. The normals were simply computed with cross products of any two non-parallel vectors connecting vertices in each face.

I generally followed this pattern for triangles:

Vector3[] aVert = new Vector3[numVerts];
Vector3[] aNormal = new Vector3[numNormals];
int[] aIndex = new int[numIndices];
int iVert = 0;
int iNormal = 0;
int iIndex = 0;
for (int i = 0; i < numIterations; ++i)
{
  int iTriStart = iVert;

  aVert[iVert++] = ComputeTriVert0(i);
  aVert[iVert++] = ComputeTriVert1(i);
  aVert[iVert++] = ComputeTriVert2(i);

  Vector3 tri01 = aVert[iTriStart + 1] - aVert[iTriStart];
  Vector3 tri02 = aVert[iTriStart + 2] - aVert[iTriStart];
  Vector3 triNormal = Vector3.Cross(tri01, tri02).normalized;
  aNormal[iNormal++] = triNormal;
  aNormal[iNormal++] = triNormal;
  aNormal[iNormal++] = triNormal;

  aIndex[iIndex++] = iTriStart;
  aIndex[iIndex++] = iTriStart + 1;
  aIndex[iIndex++] = iTriStart + 2;
}

And this pattern for quads:

Vector3[] aVert = new Vector3[numVerts];
Vector3[] aNormal = new Vector3[numNormals];
int[] aIndex = new int[numIndices];
int iVert = 0;
int iNormal = 0;
int iIndex = 0;
for (int i = 0; i < numIterations; ++i)
{
  int iQuadStart = iVert;

  aVert[iVert++] = ComputeQuadVert0(i);
  aVert[iVert++] = ComputeQuadVert1(i);
  aVert[iVert++] = ComputeQuadVert2(i);
  aVert[iVert++] = ComputeQuadVert3(i);

  Vector3 quad01 = aVert[iQuadStart + 1] - aVert[iQuadStart];
  Vector3 quad02 = aVert[iQuadStart + 2] - aVert[iQuadStart];
  Vector3 quadNormal = Vector3.Cross(quad01, quad02).normalized;
  aNormal[iNormal++] = quadNormal;
  aNormal[iNormal++] = quadNormal;
  aNormal[iNormal++] = quadNormal;
  aNormal[iNormal++] = quadNormal;

  aIndex[iIndex++] = iQuadStart;
  aIndex[iIndex++] = iQuadStart + 1;
  aIndex[iIndex++] = iQuadStart + 2;
  aIndex[iIndex++] = iQuadStart;
  aIndex[iIndex++] = iQuadStart + 2;
  aIndex[iIndex++] = iQuadStart + 3;
}

The Shaders

The positional portion of the vertex shader for all styles is actually identical, so I wanted to find a way to avoid creating an extra set of vertex and fragment shaders just in order to add the logic for normals. Then I found out about Unity’s shader variant feature. By using the shader_feature keyword and #ifdef‘s in the shaders, combined with the Material.EnableKeyword method, I was able to choose from a collection of variants generated from a single master shader at run time for each primitive mesh type. I used the NORMAL_ON keyword for the normal feature.

As shown below, only when the NORMAL_ON keyword is enabled are normals included in the vertex structs.

#pragma shader_feature NORMAL_ON

struct appdata
{
  float4 vertex : POSITION;

#ifdef NORMAL_ON
  float3 normal : NORMAL;
#endif
};

struct v2f
{
  float4 vertex : SV_POSITION;

#ifdef NORMAL_ON
  float3 normal : NORMAL;
#endif
};

The model-view matrix is used to transform vertex positions from object space into view space, but normals need to be transformed using the inverse transpose of the model-view matrix. Since the scaling is offloaded to the shader, I needed to fold in the scaling portion of the inverse transpose of the model-view matrix myself.

v2f vert (appdata v)
{
  v2f o;

  // ...

#ifdef NORMAL_ON
  float4x4 scaleInverseTranspose = float4x4
  (
    1.0f / _Dimensions.x, 0.0f, 0.0f, 0.0f, 
    0.0f, 1.0f / _Dimensions.y, 0.0f, 0.0f, 
    0.0f, 0.0f, 1.0f / _Dimensions.z, 0.0f, 
    0.0f, 0.0f, 0.0f, 1.0f
  );
  float4x4 m = mul(UNITY_MATRIX_IT_MV, scaleInverseTranspose);
  o.normal = mul(m, float4(v.normal, 0.0f)).xyz;
#endif

  return o;
}

I also used the shader_feature keyword to optionally activate the “cap shift/scaling” logic for cylinders and capsules. Recall from the previous post that in order not to generate a mesh for each possible height, only unit-height cylinder and capsule meshes are generated, and the caps are shifted towards the X-Z plane, scaled, and then shifted back to the final height. I used the CAP_SHIFT_SCALE keyword for this feature.

#pragma shader_feature CAP_SHIFT_SCALE

// (x, y, z) == (dimensionX, dimensionY, dimensionZ)
// w == capShiftScale
//   shifts 0.5 towards X-Z plane, scale by dimensions, 
//   and then shoft back 0.5 * capShiftScale)
float4 _Dimensions;

v2f vert (appdata v)
{
  v2f o;

#ifdef CAP_SHIFT_SCALE
  const float ySign = sign(v.vertex.y);
  v.vertex.y -= ySign * 0.5f;
#endif

  v.vertex.xyz *= _Dimensions.xyz;

#ifdef CAP_SHIFT_SCALE
  v.vertex.y += ySign * 0.5f * _Dimensions.w;
#endif

  o.vertex = UnityObjectToClipPos(v.vertex);

  // ...

  return o;
}

I noticed some Z-fighting between the two styles when I drew the same meshes twice, once in wireframe style and once in shaded style. It was actually an easy fix. I just added a small Z-bias to make sure the wireframe lines are always drawn in front of the shaded pixels.

float _ZBias;

v2f vert (appdata v)
{
  v2f o;

  // ...

  o.vertex = UnityObjectToClipPos(v.vertex);
  o.vertex.z += _ZBias;

  // ...

  return 0;
}

And finally here’s the fragment shader. It really doesn’t contain anything out of the ordinary, except that it remaps the vertex brightness from (0.0, 1.0) to (0.3, 1.0), because I really don’t like completely black pixels.

fixed4 frag (v2f i) : SV_Target
{
  fixed4 color = _Color;

#ifdef NORMAL_ON
  i.normal = normalize(i.normal);
  color.rgb *= 0.7f * i.normal.z + 0.3f; // darkest at 0.3f
#endif

  return color;
}

Conclusion

That’s it! I am pretty satisfied with the current Unity debug draw utility. It’s also easy to combine primitives to make more interesting shapes, such as the arrows shown in the demo animation above.

Potentially, the meshes for flat-shaded and smooth-shaded styles, generated from the mesh factory, can be used to implement a gizmo utility. But I’ll probably only do it when I really need it.

Stay tuned for more documentation of my future venture into Unity land.

Until next time!

The post Unity Debug Draw Utility – Now with Shaded Styles first appeared on Ming-Lun "Allen" Chou | 周明倫.