Saturday, March 14, 2026

Simple compression made for a slightly longer string length

 Continued from last post: (Prompts)

What will happen if I let the string length be larger, like 255 instead of 7, and I also run-length encode repetitions of codes in one node of the BST?

I want you to critique a technique I am thinking of, where LZ77 is modified by referring to future encountered strings in a second pass that reorders the indexes so that larger lengths of text are matched.

The technique you are describing is a fascinating hybrid of LZ77 (Dictionary coding) and Global Optimization. In the compression world, this is often referred to as "Optimal Parsing" or "Look-ahead Matching," and when you involve reordering, you are touching on "Block Sorting."

Here is a critique of your proposed technique, broken down by its theoretical advantages and practical "bottlenecks."


1. The "Permutation" Tax (The Biggest Hurdle)

The most critical part of your idea is the "reordering of indexes."

  • The Problem: If you reorder the source text to bring similar strings closer together (to get those longer LZ77 matches), the decoder no longer knows the original order of the characters.

  • The Cost: As we saw in our previous discussion about Lehmer codes, the "map" required to restore the original order is often larger than the space you saved by finding longer matches.

  • Critique: For this to be effective, the gain in match length must be greater than the bits required to encode the permutation. In most natural data (text, code), the cost of the permutation usually wipes out the compression gains unless the data is extremely repetitive.
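The break-even point can be sketched numerically. The snippet below (my own illustration, not from the original discussion) uses log₂(n!) via `math.lgamma` as the cost of naming one arbitrary permutation of n symbols:

```python
import math

def permutation_bits(n: int) -> float:
    # Bits needed to name one arbitrary permutation of n items: log2(n!)
    return math.lgamma(n + 1) / math.log(2)

# For a 64-symbol block the map alone costs ~296 bits (~37 bytes), so the
# longer matches would have to save more than that just to break even.
for n in (8, 64, 1024):
    print(n, round(permutation_bits(n)))
```

The cost grows as roughly n·log₂(n) bits, which is why the permutation map overwhelms the match savings on anything but tiny, extremely repetitive inputs.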

2. Connection to the Burrows-Wheeler Transform (BWT)

Your idea of "reordering to find better matches" is essentially what the BWT (used in bzip2) does.

  • How BWT does it: It reorders the entire string alphabetically based on its suffixes. This clusters similar characters together.

  • The "Magic": The BWT is reversible with only one integer index (the "primary index"), rather than a full Lehmer-style permutation map.

  • Critique: If your goal is to reorder data for better compression, BWT is the "gold standard" because it avoids the Lehmer code overhead. If your technique requires a full index map, it will likely lose to BWT.
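To make the one-integer reversibility concrete, here is a naive O(n² log n) sketch of the BWT (for clarity only; bzip2 uses far faster suffix-sorting internally). The inverse needs nothing but the transformed string and the primary index:

```python
def bwt(s: str) -> tuple[str, int]:
    # Naive forward BWT: sort all rotations, emit the last column
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    last_col = "".join(r[-1] for r in rotations)
    return last_col, rotations.index(s)  # primary index = row of the original

def inverse_bwt(last_col: str, idx: int) -> str:
    # Rebuild by repeatedly prepending the last column and re-sorting
    table = [""] * len(last_col)
    for _ in range(len(last_col)):
        table = sorted(last_col[i] + table[i] for i in range(len(last_col)))
    return table[idx]

enc, idx = bwt("banana")   # -> ("nnbaaa", 3)
assert inverse_bwt(enc, idx) == "banana"
```

One integer instead of a log₂(n!)-bit permutation map is exactly the overhead saving that makes the reordering pay for itself.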

3. "Optimal Parsing" vs. "Greedy Parsing"

Standard LZ77 is "Greedy." It takes the longest match it sees right now. However, taking a 10-byte match now might prevent you from taking two 8-byte matches later.

  • Your "Second Pass": By looking at "future encountered strings," you are performing Optimal Parsing.

  • Existing Implementations: Algorithms like LZMA (7-Zip) and Zstandard use this. They use a "flexible" or "optimal" parser that looks ahead several steps to decide which match sequence results in the fewest total bits.

  • Critique: This is a proven, highly effective technique. The "future pass" allows you to build a cost-graph and find the shortest path (in bits) to encode the file. It increases compression time significantly but improves the ratio.
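A toy version of that cost-graph idea, with made-up fixed per-token bit costs (real parsers like LZMA's model these costs adaptively), is a shortest-path dynamic program over positions:

```python
LITERAL_BITS, MATCH_BITS = 9, 24  # illustrative fixed costs, not LZMA's real model

def optimal_parse(n: int, matches: dict[int, int]) -> float:
    """Minimum bits to encode n symbols, given the longest match length
    available at each position. Classic shortest-path DP over positions."""
    INF = float("inf")
    cost = [0.0] + [INF] * n
    for i in range(n):
        if cost[i] == INF:
            continue
        # Option 1: emit symbol i as a literal
        cost[i + 1] = min(cost[i + 1], cost[i] + LITERAL_BITS)
        # Option 2: emit one match token covering m symbols
        m = matches.get(i, 0)
        if m >= 2 and i + m <= n:
            cost[i + m] = min(cost[i + m], cost[i] + MATCH_BITS)
    return cost[n]

# 16 literals would cost 144 bits; two 8-byte matches cost 48.
print(optimal_parse(16, {0: 8, 8: 8}))  # -> 48.0
```

The "future pass" supplies the `matches` table; the DP then picks the globally cheapest token sequence rather than the locally longest match.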

4. Non-Causal Dependencies (The Circularity Problem)

LZ77 is usually "causal"—the decoder only needs what it has already seen to decode what is coming next.

  • The Technique: If you refer to "future" strings in a way that the decoder hasn't reached yet, the decoder cannot start work until it has the entire future block in memory.

  • Critique: This turns a "Streaming" algorithm into a "Block" algorithm. This is fine for file compression (like .7z), but it makes the technique unusable for real-time applications (like video streaming or network protocols) where you can't wait for the "future" to arrive.

5. Dictionary Pre-Sorting

If by "reordering," you mean keeping the original text as-is but reordering how the dictionary is searched, you are onto a very strong technique.

  • Static Dictionaries: Some modern compressors look at the "future" of a dataset (or a training set), identify the most common long strings, and put them into a "Static Dictionary" that is sent before the actual data.

  • Critique: This is very effective for small messages. If you know the "future" involves many repetitions of "UserLoginRequest," putting that at the front of the index allows for 1-byte matches throughout the rest of the stream.
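A toy sketch of the idea (the escape scheme and names here are invented; real trained dictionaries, like Zstandard's, are far more sophisticated). Each shared phrase collapses to a 2-byte reference:

```python
def encode(text: str, dictionary: list[str]) -> str:
    # Replace each dictionary phrase with ESC + index.
    # Toy assumptions: text contains no NUL bytes, at most 256 phrases.
    for i, phrase in enumerate(dictionary):
        text = text.replace(phrase, "\x00" + chr(i))
    return text

def decode(data: str, dictionary: list[str]) -> str:
    out, i = [], 0
    while i < len(data):
        if data[i] == "\x00":            # escape: next char is a phrase index
            out.append(dictionary[ord(data[i + 1])])
            i += 2
        else:
            out.append(data[i])
            i += 1
    return "".join(out)

shared = ["UserLoginRequest"]            # built from the "future" / training set
msg = "UserLoginRequest:alice;UserLoginRequest:bob"
packed = encode(msg, shared)
assert decode(packed, shared) == msg and len(packed) < len(msg)
```

Because the dictionary is agreed on before transmission, the stream itself stays causal and streamable.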


Summary Recommendation

Your idea is moving toward "Optimal Parsing" and "Block Sorting." To make it work:

  1. Don't reorder the text physically unless you use a transform like BWT. Storing the "reordering map" (the permutation) is almost always too expensive for files larger than a few dozen bytes.

  2. Use the "Future Pass" for decision making. Use the second pass to look at the entire file and decide: "Should I use a match here, or save this string to be a reference for a much larger match later?"

  3. Entropy Coding of Matches: If you find very long matches by looking ahead, ensure you use Arithmetic Coding or Finite State Entropy (FSE) to encode the match lengths and offsets. Long matches are rare, so they should be encoded with very few bits when they do occur.
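Arithmetic coding or FSE is the real answer for point 3; as a much simpler stand-in, a universal code like Elias gamma already shows the shape: frequent short lengths get short codes, while rare long matches pay proportionally more bits but occur rarely.

```python
def elias_gamma(n: int) -> str:
    """Elias gamma code for n >= 1: (bit-length - 1) zeros, then n in binary."""
    b = bin(n)[2:]
    return "0" * (len(b) - 1) + b

assert elias_gamma(1) == "1"        # 1 bit for the most common length
assert elias_gamma(5) == "00101"    # 5 bits
assert len(elias_gamma(300)) == 17  # a rare long match costs more, but rarely
```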

In short: You've independently identified the core logic behind LZMA/7-Zip. The secret to its success isn't just finding the matches, but using that "second pass" to calculate the mathematically perfect combination of matches to minimize the bit-stream.


Simple Compression close to Kolmogorov complexity

 Prompt: I am working on a low-level coding task where the task is to encode and decode a string. I want Python code for encoding a string by first sorting the characters alphabetically while remembering the position of each character and encoding it using Lehmer's code. The encoder will encode the character codes differentially, like: original array => 231, 3, 45, 0, 23, 32, 78; sorted array => 0, 3, 23, 32, 45, 78, 231, sent as the permutation rank, plus another array encoding of 32, 29, 46, 3, 20, 33, 153, where the sorted numbers are sent in preorder and the first 32 is encoded as 8 bits, the second 29 encodes 3 using 6 bits. I want the bit stream encoder and decoder.

The generated code is pushed to GitHub.

Comparing this custom method to standard lossless compression utilities (like Gzip (DEFLATE), Zstandard, LZMA, or Brotli) reveals a classic trade-off between Generality and Domain-Specific Optimization.

Here is how your method stacks up:

1. The "Tiny Data" Advantage (Winning)

General-purpose compressors (Gzip, Zstandard) are designed for files, not individual 7-character strings.

  • Overhead/Headers: Gzip has a minimum header size of about 10–18 bytes. Even an "empty" Gzip file is larger than your entire 7-byte encoded string.

  • Dictionary Building: Algorithms like LZ77 need a "sliding window" to find repetitions. In a 7-character string, there is almost no repetition to find.

  • Result: For ultra-short, fixed-length metadata (like a 64-bit packet header in a radio protocol), your method will significantly outperform Gzip/Zstd.
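This is easy to verify with Python's standard library (exact sizes depend on the zlib build, so only the inequality matters):

```python
import gzip

msg = b"0,3,23,32,45,78,231"   # ~19 bytes of short, structured data
packed = gzip.compress(msg)
# gzip's ~18 bytes of header + trailer dominate: output is LARGER than input
assert len(packed) > len(msg)
print(len(msg), "->", len(packed))
```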

2. Information Theory & Entropy (The Limit)

Your method is essentially a form of Structural Encoding. You aren't just compressing "data"; you are compressing the mathematical properties of a permutation.

  • Permutation: There are n! ways to arrange n characters. For n = 7, that is 7! = 5040. To identify one specific arrangement, you need log₂(5040) ≈ 12.3 bits. Your Lehmer code uses 14 bits (because it uses integer bit-boundaries for each coefficient). This is nearly 90% efficient compared to the theoretical limit.

  • Values: By using the BST range restriction, you are performing a version of Arithmetic Coding without the floating-point complexity. You use the "known universe" of the parent node to reduce the entropy of the child node.
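The 14-bit figure is easy to check: the i-th Lehmer digit ranges over n−i values, so whole-bit fields cost ⌈log₂7⌉ + ⌈log₂6⌉ + … + ⌈log₂2⌉ bits. A small sketch (my own illustration of the digit extraction and bit counting):

```python
import math

def lehmer_digits(perm: list[int]) -> list[int]:
    # d_i = number of elements after position i that are smaller than perm[i]
    return [sum(perm[j] < perm[i] for j in range(i + 1, len(perm)))
            for i in range(len(perm))]

def whole_bit_cost(n: int) -> int:
    # Each digit d_i in [0, n-1-i] gets its own ceil(log2) bit field
    return sum(math.ceil(math.log2(k)) for k in range(2, n + 1))

assert whole_bit_cost(7) == 14               # the 14 bits cited above
assert math.log2(math.factorial(7)) < 12.4   # ~12.3-bit theoretical floor
assert lehmer_digits([3, 0, 2, 1]) == [3, 0, 1, 0]
```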

3. Scaling (Losing)

This is where general-purpose utilities take the lead as the input grows.

  • The Lehmer Trap: As the string length N increases, the Lehmer code becomes computationally expensive to calculate (O(N²) naively, or O(N log N) with Fenwick trees). More importantly, it doesn't account for patterns.

  • Redundancy: If your string is "AAAAAAAA", a general compressor sees a "run" and compresses it to almost nothing using RLE or LZ77. Your method would still calculate a full Lehmer code and a BST, which is overkill for redundant data.

  • Alphabet size: General compressors use Huffman or FSE (Finite State Entropy) to map frequent characters (like 'e' or 't') to shorter bit-sequences. Your method treats all characters (0-255) as equally likely unless the BST range happens to narrow down.
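The "AAAAAAAA" point is easy to demonstrate with zlib from the standard library:

```python
import zlib

run = b"A" * 1000
packed = zlib.compress(run)
# LZ77 collapses the run to a handful of bytes; a Lehmer-code approach
# would still pay full price for the (trivial) permutation
assert len(packed) < 30
print(len(run), "->", len(packed))
```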

4. Comparison Summary Table

| Feature | Your Custom Method | Gzip / Zstd / LZMA |
|---|---|---|
| Smallest Unit | Bits (can be exactly 56 bits) | Bytes (usually min 20+ bytes) |
| Header Overhead | Zero (if length is fixed) | High (10–100+ bytes) |
| Sorted Data | Extremely efficient (BST Delta) | Moderate (Standard Delta) |
| Repetitive Data | Poor (No dictionary) | Excellent (LZ77) |
| Scaling | Poor (Lehmer grows fast) | Excellent |
| Complexity | High logic, low memory | Low logic, high memory |

When to use your method?

Your approach is a "Low-Level Serialization" technique rather than a "Compression Utility." Use it when:

  1. Strict Bandwidth Constraints: You are sending data over LoRa, Satellite, or CAN bus where every single bit costs money or power.

  2. Known Structure: You know the data is always exactly N characters and usually contains distinct values.

  3. No Storage for Libraries: You are on an Arduino or an FPGA where you cannot afford the memory footprint of the Zstandard library.

Final Verdict

If you try to compress a 1MB text file with this, it will be much slower and likely produce a larger file than Gzip. However, for a 7-byte identifier, your method is near the theoretical "Kolmogorov Complexity" limit, and standard utilities cannot touch it.