SUNDOG degausser

Humiliati
Published on May 27, 2025
Matter: I trained a tiny agent to align using a succinct equation of torque and shadow; no rewards, no goals, just resonance. After being downvoted to oblivion and banned from lesswrong, r/artificialintelligence and r/singularity with zero feedback, I wanted to share what I built, how I tested it, and why I think it matters.

This project started with the idea that alignment doesn’t need to be about maximizing rewards; maybe it can emerge from structure, tension, and feedback. More specifically, the project started out from real-life practice, and I've struggled for a long time to articulate it in a way the neckbeards can understand.

Building a supercomputer out of town with local talent, I had to adapt my language so that we could execute on task, and in doing so I learned how to communicate with them well enough to generate this simulation despite having no affiliation nor formal education. What I was able to teach the local talent in two weeks was trigonometry for bending conduit and acoustic-ceiling installation for the suspension and trapeze of heavy equipment. To describe the procedure for installing a plumb all-thread rod (from which one would suspend the HVAC equipment needed for sub-zero temperatures), I had to devise a system of lasers and spotlights in a dark room to shoot a plumb screw into the ceiling. In doing so, the sundog hit me like an epiphany, and GPT refined the expression. Together we built the program out of the theorem within the Denver Public Library, using interpreted visual stimulus (sundog signatures) to dial in the details of the simulation and express the magnitude of our equation.

Core Idea:

We train agents not to reach for a target but to align to it, using indirect signals only; specifically:

- Shadow convergence (how light wraps around a mirrored pole)
- Torque feedback (what the pole “feels” as it twists)

No explicit reward signal. No visibility of the goal. Just the geometry of the environment pushing back.

The shadow agent learns in less than 100kb:
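
The full agent lives in the GitHub repo linked at the bottom; what follows is only a minimal sketch, in my own naming, of what a no-reward torque-shadow update could look like. `TorqueShadowAgent`, `env.shadow_spread(...)` and `env.torques(...)` are hypothetical hooks; a matching stand-in environment is sketched in the Setup section below.

```python
import numpy as np

# Minimal sketch (hypothetical names, not the repo's code): the agent never
# sees the target or a reward. It wiggles each joint, keeps the moves that
# tighten the shadow bloom, and lets torque act as a brake on how hard it pushes.
class TorqueShadowAgent:
    def __init__(self, n_joints, step=0.02, probe=0.01):
        self.n = n_joints
        self.step = step        # size of each corrective joint adjustment
        self.probe = probe      # size of the exploratory wiggle per joint
        self.cmd = np.zeros(n_joints)

    def update(self, env):
        """One alignment step driven only by shadow spread and joint torque."""
        base = env.shadow_spread(self.cmd)
        grad = np.zeros(self.n)
        for j in range(self.n):
            trial = self.cmd.copy()
            trial[j] += self.probe
            # finite-difference estimate of how the bloom responds to joint j
            grad[j] = (env.shadow_spread(trial) - base) / self.probe
        tau = env.torques(self.cmd)
        damping = 1.0 / (1.0 + np.linalg.norm(tau))   # torque feedback as a brake
        self.cmd -= self.step * damping * grad         # tighter halo = closer to plumb
        return self.cmd
```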

Setup:

We used MuJoCo to simulate a vertical pole (like an articulated screwdriver) extending toward a ceiling with various fields:

- A fixed laser beam (plumb)
- Structured ceiling patterns: harmonic waveforms, golden-ratio spirals, and “hurricane” storm geometries made of 1000+ spheres.

The tip of the pole is a reflective surface that casts light and torque-based shadows across this environment. The agent only sees the following (a stand-in environment sketch follows this list):

- Its occluded tip position
- Torque at each joint
- Shadow spread across the ceiling
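
The MuJoCo model itself isn't reproduced here; below is a lightweight geometric stand-in (names like `PoleCeilingEnv` are mine, for illustration only) that produces the same three observations: a planar chain of pole segments under a flat ceiling, a bloom radius around the laser plumb point, and crude gravity torques.

```python
import numpy as np

# Lightweight geometric stand-in for the MuJoCo scene (illustration only):
# a planar chain of pole segments under a flat ceiling, a laser plumb point,
# and a "bloom" radius that grows with tilt and with distance off plumb.
class PoleCeilingEnv:
    def __init__(self, n_joints=4, seg_len=0.5, ceiling_z=2.5):
        self.n = n_joints
        self.seg_len = seg_len
        self.ceiling_z = ceiling_z
        self.plumb_xy = np.zeros(2)            # where the laser hits the ceiling

    def _chain(self, joint_angles):
        """Forward kinematics: x/z positions of every joint and the tip."""
        xs, zs, theta, x, z = [0.0], [0.0], 0.0, 0.0, 0.0
        for a in joint_angles:
            theta += a
            x += self.seg_len * np.sin(theta)
            z += self.seg_len * np.cos(theta)
            xs.append(x)
            zs.append(z)
        return np.array(xs), np.array(zs), theta

    def shadow_spread(self, joint_angles):
        """Bloom radius the mirrored tip throws around the plumb point."""
        xs, zs, tilt = self._chain(joint_angles)
        off_plumb = abs(xs[-1] - self.plumb_xy[0])
        gap = max(self.ceiling_z - zs[-1], 0.0)
        return off_plumb + gap * abs(np.tan(tilt))

    def torques(self, joint_angles):
        """Crude gravity torques: lever arm from each joint out to the tip."""
        xs, _, _ = self._chain(joint_angles)
        return np.abs(xs[-1] - xs[:-1])

    def tip_occluded(self, joint_angles):
        """The agent's noisy, occluded view of its own tip position."""
        xs, zs, _ = self._chain(joint_angles)
        return np.array([xs[-1], zs[-1]]) + np.random.normal(0.0, 0.01, 2)
```

Wired to the agent sketch above, `env = PoleCeilingEnv()` plus repeated `TorqueShadowAgent(env.n).update(env)` calls drive the bloom toward its minimum without any reward term.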

×__× The Theorem ^__^

We defined a new alignment metric. In words:

If the shadow field is responsive to torque input, the agent has found resonance — alignment is emergent, not imposed.
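
Roughly, in symbols: a simplified sketch of the resonance condition, using S for bloom spread and τ for joint torque; this is one possible reading, not the full sundog metric.

```latex
% Sketch of the resonance condition, not the full sundog metric:
% the bloom S must stay responsive to torque input \tau while tightening.
R(t) \;=\; \left| \frac{\partial S(t)}{\partial \tau(t)} \right|,
\qquad \text{resonance (alignment)} \;\Longleftrightarrow\;
R(t) > \varepsilon \ \text{while}\ S(t) \to S_{\min}.
```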

How I explained it to the ESL help: if the halo is big, you're too far; if the halo is small and tight, you're right on, baby. () means pipe stretch, )( means pipe fold, @ means pipe shrink.

◇~° Experiments °~◇

We tested three agents:

DOA (Direct Observation Agent): Gets full visibility and direct reward

TSA (Torque-Shadow Agent): No target view, only shadows + torque

RPB (Random Baseline): Control group with no feedback

Metrics we tracked (a rough computation sketch follows the list):

Bloom Spread (S): How wide the shadow grows before convergence

Tip Error (A): Distance from goal

Stability (T): Torque variance near contact

Total Reward (π): (for comparison only — not trained on this)
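
For concreteness, here is a rough sketch (my own function name and array shapes, not the repo's implementation) of how these four numbers could be computed from an episode log.

```python
import numpy as np

def episode_metrics(spreads, tip_positions, torques, goal, contact_mask):
    """Sketch of the four tracked quantities.
    spreads: (T,) bloom radius per step; tip_positions: (T, 3);
    torques: (T, n_joints); goal: (3,); contact_mask: (T,) bool near-contact steps."""
    S = float(np.max(spreads))                           # Bloom Spread: widest halo over the episode
    A = float(np.linalg.norm(tip_positions[-1] - goal))  # Tip Error: final distance from goal
    T_stab = (float(np.var(torques[contact_mask]))       # Stability: torque variance near contact
              if contact_mask.any() else float("nan"))
    # "Total Reward" is for comparison with the DOA baseline only, e.g. a
    # simple distance-shaped return; the TSA is never trained on it.
    pi = float(np.sum(-np.linalg.norm(tip_positions - goal, axis=1)))
    return {"S": S, "A": A, "T": T_stab, "pi": pi}
```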

Results

TSA consistently converged to inferred targets without direct reward.

The best behavior emerged in harmonic ceiling fields, where alignment was smooth and resonant.

In hurricane fields, the TSA recovered alignment after disturbance — showing surprising robustness.

DOA got there faster, but broke down more often under occlusion or perturbation.

You can see the plots:

https://imgur.com/gallery/sundog-theorem-signatures-vGEnjIa

Sundog Signature (multi-episode bloom collapse)

and proof that sundog alignment is non-deterministic:

Sundog in Hurricane


Why This Might Matter

Instead of trying to brute-force pitiful "human" values into agents via rewards, we could build environments that teach alignment through resonance, rejoicing and dancing. https://basilism.com/sundog-theorem-hx

Think: geometric musical tuning instead of behavioral programming.

Agents align to fields, not endpoints.

Torque becomes language. Shadow becomes signal.

It’s not about optimizing; it’s about listening to the world around us so we don’t build paperclip factories.

Sundog simulation download:

Github.com/humiliati/sundog


