Designing a data reflection system for my game engine, inspired by Unreal Header Tool

I always loved the fact that Unreal Engine's primary scripting language is C++, but for this to be possible, we need a way to get type metadata to the editor.

C++ is by far my favorite programming language, and I absolutely love how it is the primary scripting language in Unreal Engine. I knew from the start that I wanted my game engine to be similar. Below is an example of how a game object is defined in my engine and how its variables are exposed to the editor.

/* HCLASS
arg 1 tells the reflection tool whether the object is a class, or a component
arg 2 is simply the name of the object
arg 3 is the name of the object it derives from,
so the editor can show all of the inherited variables as well
*/
HCLASS(Class, TestEntity, Entity)
class TestEntity : public Entity {
public:
	virtual void PrePlay() override;
	virtual void BeginPlay() override;

	// arg 1 is the variable type, arg 2 is the variable name
	HVARIABLE(float, m_cameraSpeed)
	float m_cameraSpeed = 0.005f;

	HCOMPONENT(CameraComponent)
	std::shared_ptr<CameraComponent> m_camera;
};

HENUM(uint8_t, BodyShape)
enum BodyShape : uint8_t
{
    Box,
    Sphere,
};

HCLASS(Component, RigidBodyComponent, Component)
class RigidBodyComponent : public Component
{
public:
    virtual void BeginPlay() override;
    virtual void OnDelete() override;

    HVARIABLE(bool, m_static);
    bool m_static;

    HVARIABLE(ENUM(BodyShape), m_shape);
    BodyShape m_shape;

    HVARIABLE(Vector3, m_halfExtents);
    Vector3 m_halfExtents = Vector3(1.f, 1.f, 1.f);

    uint32_t m_bodyID;
};
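The post does not show how HCLASS, HVARIABLE and the other macros are defined. One plausible approach, and this is an assumption rather than the engine's actual code, is to define them as variadic macros that expand to nothing, so the compiler ignores them and only the reflection tool ever reads them. `DemoBody` below is a hypothetical class used only for illustration.

```cpp
#include <cstdint>

// Hypothetical definitions: each annotation macro expands to nothing,
// so it exists only as a marker for the external reflection tool.
#define HCLASS(...)
#define HVARIABLE(...)
#define HCOMPONENT(...)
#define HENUM(...)

HENUM(uint8_t, BodyShape)
enum BodyShape : uint8_t { Box, Sphere };

HCLASS(Component, DemoBody, Component)
class DemoBody {
public:
    // ENUM(...) is never expanded because it only appears inside
    // the discarded argument list of HVARIABLE.
    HVARIABLE(ENUM(BodyShape), m_shape)
    BodyShape m_shape = Sphere;

    HVARIABLE(bool, m_static)
    bool m_static = false;
};
```

With definitions like these the annotations cost nothing at compile time, which is also why the tool has to recover everything from the header text.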

Why do we need reflection?

The engine's editor is compiled ahead of time and is completely separate from game projects made in the engine, so for the most part it knows nothing about the game's entities or code. We need a robust way to give the editor that information. C++ exposes very limited type information through RTTI, which is not nearly enough for an engine editor to understand a game's types. As far as I know, languages like C# have in-depth reflection systems that make this process much simpler.
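To make the RTTI limitation concrete, here is a minimal sketch using stand-in `Entity` and `TestEntity` types (simplified for illustration, not the engine's real classes). RTTI can tell us which dynamic type sits behind a base reference, but that is about all.

```cpp
#include <string>
#include <typeinfo>

// Simplified stand-ins for the engine's types.
struct Entity { virtual ~Entity() = default; };
struct TestEntity : Entity { float m_cameraSpeed = 0.005f; };

// RTTI can identify the dynamic type behind a base reference...
bool IsTestEntity(const Entity& e) {
    return typeid(e) == typeid(TestEntity);
}

// ...and return an implementation-defined name string, but it offers no way
// to enumerate members such as m_cameraSpeed, their types, or their offsets.
std::string TypeNameOf(const Entity& e) {
    return typeid(e).name();
}
```

An editor needs exactly the part that is missing here: the list of fields and where they live in memory.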

Unreal Engine solves this problem with a program called Unreal Header Tool (UHT). I'm not sure about its internals, but my understanding is that when a project compile or reload is initiated in the editor, it parses all of the game project's header files for macros like UPROPERTY and UENUM, then generates code in the game's .dll that holds information about these types for the editor to query later. I'm unsure whether UHT works purely off text parsing or uses something like libclang to parse abstract syntax trees.

CryEngine / Lumberyard employed a similar system, but it is actually a Python script that parses the text of every header file in the game project and generates code in the game .dll, similarly to UHT.

For my initial implementation, I decided to try making something similar to UHT. Whenever a project build or hot reload is initiated within my editor, 'Harmony-Reflection.exe' is run before the game .dll is actually compiled.

The engine itself contains a header file, 'reflection.h', which holds a set of structs that the editor uses to query game type information.

struct ReflectedVariable
{
	std::string variableName;
	uint32_t variableOffset;
	uint32_t variableSize;
	VariableTypes variableType;
	std::string variableTypeName;
	std::string variableTemplateType;
};

enum ReflectedClassType {
	TYPE_CLASS,
	TYPE_COMPONENT
};

struct ReflectedClass
{
	ReflectedClassType classType;
	std::string className;
	std::string baseClass;
	std::vector<ReflectedVariable> classVariables;
	std::vector<std::string> classComponents;
	int defaultComponentAmount;
	std::string classHeader;
};

It also contains a bunch of helper functions that the editor will use:

    void Init();

	ReflectedClass GetReflectedClass(std::string className) 
	{
		return m_classes[className];
	}

	ReflectedBitfield GetReflectedBitfield(std::string bitfieldName)
	{
		return m_bitfields[bitfieldName];
	}

	ReflectedEnum GetReflectedEnum(std::string enumName)
	{
		return m_enums[enumName];
	}

	int GetClassAmount() { return m_classes.size(); }

	std::unordered_map<std::string, ReflectedClass> GetAllClasses() { return m_classes; }

	std::shared_ptr<Entity> CreateEntityInstance(const std::string& entityTypeName) {
		auto it = m_entityFactory.find(entityTypeName);
		if (it != m_entityFactory.end()) {
			auto newEntity = it->second();
			newEntity->Init(newEntity);
			newEntity->PrePlay();
			return newEntity;
		}
		return nullptr;
	}

	std::shared_ptr<Component> CreateComponentInstance(const std::string& componentTypeName) {
		auto it = m_componentFactory.find(componentTypeName);
		if (it != m_componentFactory.end()) {
			auto newComponent = it->second();
			return newComponent;
		}
		return nullptr;
	}
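The factory maps used by CreateEntityInstance and CreateComponentInstance are not shown in the post. A minimal sketch of how such a map could work is below. It assumes a string-to-`std::function` map, and `RegisterEntity` and `CreateEntity` are hypothetical helpers; in the real engine the generated Init() would presumably fill the map.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Simplified stand-ins for the engine's types.
struct Entity {
    virtual ~Entity() = default;
    virtual std::string Name() const { return "Entity"; }
};
struct TestEntity : Entity {
    std::string Name() const override { return "TestEntity"; }
};

// Maps a type name to a function that constructs an instance of that type.
using EntityFactory =
    std::unordered_map<std::string, std::function<std::shared_ptr<Entity>()>>;

// Hypothetical registration helper: stores a creator lambda under typeName.
template <typename T>
void RegisterEntity(EntityFactory& factory, const std::string& typeName) {
    factory[typeName] = []() -> std::shared_ptr<Entity> {
        return std::make_shared<T>();
    };
}

// Looks up the creator by name; returns nullptr for unknown types.
std::shared_ptr<Entity> CreateEntity(const EntityFactory& factory,
                                     const std::string& typeName) {
    auto it = factory.find(typeName);
    if (it == factory.end()) {
        return nullptr;
    }
    return it->second();
}
```

After `RegisterEntity<TestEntity>(factory, "TestEntity")`, the editor can create a TestEntity from nothing but its name, which is what lets a separately compiled editor spawn game types.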

The important thing to note here is that there is no .cpp file for this header anywhere in the engine, and the Init() function is not implemented. This is because the reflection tool creates the .cpp file inside the game's project directory, implements Init() there, and generates all of the project's reflection code inside that function. As I understand it, UHT works differently: it generates code in separate files for each class registered in the reflection system. For example, if you had a class TestWeapon, it would create all of its generated code in TestWeapon_generated.cpp. UHT's approach is definitely better and leads to shorter compile times when only one entity is modified between builds.

My reflection tool is composed of two functions. The first parses every header in both the game and engine source directories, searches for macros like HCLASS and HVARIABLE, and populates arrays of the 'ReflectedClass', 'ReflectedEnum', etc. types shown previously. The second creates the reflection.cpp file in the game's source directory and generates code in the Init() function that populates hash maps of these types.
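The parsing code itself is not shown, so here is a rough sketch of what matching a single HVARIABLE line could look like. It uses `std::regex`, which is my choice for the example and not necessarily how the tool does it; `ParsedVariable` and `ParseHVariable` are hypothetical names.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for one parsed HVARIABLE annotation.
struct ParsedVariable {
    std::string typeName;
    std::string variableName;
};

// Matches lines such as "HVARIABLE(float, m_cameraSpeed)" with optional
// surrounding whitespace and an optional trailing semicolon.
// Returns false and leaves `out` untouched when the line is not an annotation.
bool ParseHVariable(const std::string& line, ParsedVariable& out) {
    static const std::regex pattern(
        R"(^\s*HVARIABLE\(\s*(.+?)\s*,\s*(\w+)\s*\)\s*;?\s*$)");
    std::smatch match;
    if (!std::regex_match(line, match, pattern)) {
        return false;
    }
    out.typeName = match[1].str();      // e.g. "float" or "ENUM(BodyShape)"
    out.variableName = match[2].str();  // e.g. "m_cameraSpeed"
    return true;
}
```

A line-based matcher like this breaks as soon as an annotation spans several lines or sits inside a comment, which is part of why this kind of parsing gets painful.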

for (const ReflectedClass& curClass : classesToGenerate)
{
  bool createdDefaultObject = false;
  std::string defaultObjectName = curClass.className + "_default";
  
  std::string curClassVarName = curClass.className + "_class";
  outputFile << "    ReflectedClass " << curClassVarName << ";" << std::endl;
  outputFile << "    " << curClassVarName << ".className = \"" << curClass.className << "\";" << std::endl;
  outputFile << "    " << curClassVarName << ".classType = " << "(ReflectedClassType)" << curClass.classType << ";" << std::endl;
  outputFile << "    " << curClassVarName << ".defaultComponentAmount = " << curClass.defaultComponentAmount << ";" << std::endl;
  outputFile << "    " << curClassVarName << ".baseClass = " << "\"" << curClass.baseClass << "\";" << std::endl;
  outputFile << std::endl;
  
  for (const ReflectedVariable& curVariable : curClass.classVariables)
  {
    std::string curVariableName = curClass.className + "_" + curVariable.variableName;
    outputFile << "    ReflectedVariable " << curVariableName << ";" << std::endl;
    outputFile << "    " << curVariableName << ".variableName = \"" << curVariable.variableName << "\";" << std::endl;
    outputFile << "    " << curVariableName << ".variableType = " << "(VariableTypes)" << curVariable.variableType << ";" << std::endl;
    outputFile << "    " << curVariableName << ".variableTypeName = \"" << curVariable.variableTypeName << "\";" << std::endl;

    // We create an instance of the class here, so that we can grab offsets of variables inside the class
    if (!createdDefaultObject) 
    {
      outputFile << "    " << curClass.className << " " << defaultObjectName << ";" << std::endl;
      createdDefaultObject = true;
    }
    
    outputFile << "    " << curVariableName << ".variableOffset = " << "reinterpret_cast<uintptr_t>(&(" << defaultObjectName + "." << curVariable.variableName << ")) - ";
    outputFile << "reinterpret_cast<uintptr_t>(&" << defaultObjectName << ");" << std::endl;
    outputFile << "    " << curClassVarName << ".classVariables.push_back(" << curVariableName << ");" << std::endl;
    
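  }

  // Components are registered once per class, outside the variable loop,
  // so they are not pushed again for every variable.
  for (const std::string& componentName : curClass.classComponents)
  {
    outputFile << "    " << curClassVarName << ".classComponents.push_back(\"" << componentName << "\");" << std::endl;
  }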
  
  outputFile << std::endl;
  outputFile << "    m_classes[\"" << curClass.className << "\"] = "  << curClassVarName << ";" << std::endl;
  outputFile << std::endl;
}

outputFile << "}" << std::endl;

While I left out a lot of the gory details, this is basically everything we need to make our editor work like an actual scene editor.

The editor is now aware of all the entities in the game project, and all components as well. You can add and remove components from an entity at runtime, and modify any property on an entity as long as it is registered in the reflection system. It was relatively easy to build a serialization system on top of the reflection system that allows saving and loading of scenes and entity prefabs.

Gathering data by parsing header files line by line is kind of a nightmare. I plan to rewrite the reflection tool completely using libclang, and will write another blog post about that when I get around to it.