SUNDOG degausser
Humiliati
Published on May 27, 2025
Matter: I trained a tiny agent to align using a succinct equation of torque and shadow; no rewards, no goals, just resonance. After being banned from LessWrong, r/artificialintelligence, and r/singularity, and downvoted to oblivion with zero feedback, I wanted to share what I built, how I tested it, and why I think it matters.
This project started with the idea that alignment doesn’t need to be about maximizing rewards; maybe it can emerge from structure, tension, and feedback. More specifically, the project grew out of real-life practice, and I’ve struggled for a long time to articulate it in a way the neckbeards can understand. Building a supercomputer out of town with local talent, I had to adapt my language so we could execute on task, and in doing so I learned how to communicate with them well enough to generate this simulation, despite having no affiliation and no formal education.

What I was able to teach the local talent in two weeks was trigonometry for bending conduit, and acoustic-ceiling techniques for the suspension and trapeze of heavy equipment. To describe the procedure for installing a plumb all-thread rod (from which one would suspend the HVAC equipment needed for sub-zero temperatures), I devised a system of lasers and spotlights in a dark room to shoot a plumb screw into the ceiling. In doing so, the sundog hit me like an epiphany, and GPT refined the expression. Together we built the program out of the theorem within the Denver Public Library, using interpreted visual stimulus (sundog signatures) to dial in the details of the simulation and express the magnitude of our equation.
Core Idea:
We train agents not to reach for a target but to align to it, using indirect signals only; specifically:
- Shadow convergence (how light wraps around a mirrored pole)
- Torque feedback (what the pole “feels” as it twists)
No explicit reward signal. No visibility of the goal. Just the geometry of the environment pushing back.
The whole shadow agent learns in less than 100 KB.
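To make the idea concrete, here is a toy sketch of the learning loop in pure NumPy. This is my illustrative stand-in, not the actual MuJoCo agent: the pole is a 2-link planar arm, the “shadow bloom” is a scalar the environment computes, and the agent nudges joints to shrink the halo without ever seeing the target.

```python
import numpy as np

def forward_kinematics(angles, link=1.0):
    """Tip position of a 2-link planar pole (toy geometry, not the
    MuJoCo model -- just enough to show the idea)."""
    x = link * np.cos(angles[0]) + link * np.cos(angles[0] + angles[1])
    y = link * np.sin(angles[0]) + link * np.sin(angles[0] + angles[1])
    return np.array([x, y])

def shadow_spread(tip_xy, target_xy):
    """Environment-side stand-in for the rendered shadow bloom:
    spread grows with tip-to-target misalignment. The agent never
    sees target_xy, only this scalar."""
    return 0.5 * float(np.sum((tip_xy - target_xy) ** 2))

def torque_shadow_step(angles, target_xy, lr=0.1, eps=1e-3):
    """One TSA-style update: probe each joint, keep the nudge that
    shrinks the bloom. No reward signal, no goal coordinates in the
    observation -- only the halo pushing back."""
    base = shadow_spread(forward_kinematics(angles), target_xy)
    grad = np.zeros_like(angles)
    for i in range(len(angles)):
        probe = angles.copy()
        probe[i] += eps
        grad[i] = (shadow_spread(forward_kinematics(probe), target_xy) - base) / eps
    return angles - lr * grad  # follow the shrinking halo
```

Iterating `torque_shadow_step` from a misaligned pose drives the bloom toward zero: big halo shrinks to a tight halo, which is the whole training signal.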
Setup:
We used MuJoCo to simulate a vertical pole (like an articulated screwdriver) extending toward a ceiling with various fields:
- A fixed laser beam (plumb reference)
- Structured ceiling patterns: harmonic waveforms, golden-ratio spirals, and “hurricane” storm geometries made of 1000+ spheres
The tip of the pole is a reflective surface that casts light- and torque-based shadows across this environment. The agent only sees:
- Its occluded tip position
- Torque at each joint
- Shadow spread across the ceiling
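As a sketch, the observation under these constraints might be packed like this. The function name, the binary ceiling mask, and the layout are my assumptions, not the repo’s exact API:

```python
import numpy as np

def observe(tip_occluded_xy, joint_torques, ceiling_shadow_mask):
    """Pack the only three things the torque-shadow agent is allowed
    to see into one observation vector. ceiling_shadow_mask is a grid
    of ceiling cells, 1 where the tip's shadow falls (illustrative)."""
    spread = float(ceiling_shadow_mask.mean())   # fraction of ceiling in shadow
    return np.concatenate([tip_occluded_xy, joint_torques, [spread]])
```

Note that the goal position never appears in the vector; the only hint of it is how the shadow fraction changes as the joints twist.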
×__× The Theorem ^__^
We defined a new alignment metric.
In words:
If the shadow field is responsive to torque input, the agent has found resonance — alignment is emergent, not imposed.
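One way to operationalize that statement in code, with assumed names and an assumed threshold (the finite-difference form below is my reconstruction, not the post’s exact equation): measure how strongly the shadow spread S co-varies with the torque input τ.

```python
import numpy as np

def resonance(torque_trace, spread_trace, eps=1e-2):
    """Responsiveness of the shadow field to torque: mean |dS/dtau|
    over an episode. If the field responds above a threshold, we call
    it resonance. Threshold eps and this form are my assumptions."""
    dS = np.diff(spread_trace)
    dtau = np.diff(torque_trace)
    mask = np.abs(dtau) > 1e-9            # skip steps with no torque change
    if not mask.any():
        return 0.0, False
    resp = float(np.mean(np.abs(dS[mask] / dtau[mask])))
    return resp, resp > eps               # resonance found iff the field answers back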
How I explained it to the ESL crew: if the halo is big, you’re too far. If the halo is small and tight, you’re right on, baby. () means pipe stretch, )( means pipe fold, @ means pipe shrink.
◇~° Experiments °~◇
We tested three agents:
DOA (Direct Observation Agent): Gets full visibility and direct reward
TSA (Torque-Shadow Agent): No target view, only shadows + torque
RPB (Random Baseline): Control group with no feedback
Metrics we tracked:
Bloom Spread (S): How wide the shadow grows before convergence
Tip Error (A): Distance from goal
Stability (T): Torque variance near contact
Total Reward (π): (for comparison only — not trained on this)
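A sketch of how those four numbers could be computed from one episode trace. The array shapes and the contact cut-point are my assumptions about the logging, not the repo’s exact code:

```python
import numpy as np

def episode_metrics(shadow_areas, tip_positions, target, torques, contact_idx):
    """Compute the four tracked metrics for one episode.
    shadow_areas:  (T,) shadow bloom area per step
    tip_positions: (T, 2) tip trajectory; target: (2,) goal point
    torques:       (T,) net torque per step
    contact_idx:   step index nearest contact (assumed known)."""
    S = float(np.max(shadow_areas[:contact_idx]))           # bloom spread before convergence
    A = float(np.linalg.norm(tip_positions[-1] - target))   # final tip error
    T = float(np.var(torques[contact_idx:]))                # torque variance near contact
    # diagnostic-only reward: never used for training, per the post
    pi = float(-np.sum(np.linalg.norm(tip_positions - target, axis=1)))
    return {"S": S, "A": A, "T": T, "pi": pi}
```

Keeping π purely diagnostic matters here: the TSA is compared on it, but its updates only ever see S, A, and T through the shadow and torque channels.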
Results
TSA consistently converged to inferred targets without direct reward.
The best behavior emerged in harmonic ceiling fields, where alignment was smooth and resonant.
In hurricane fields, the TSA recovered alignment after disturbance — showing surprising robustness.
DOA got there faster, but broke down more often under occlusion or perturbation.
You can see the plots:
https://imgur.com/gallery/sundog-theorem-signatures-vGEnjIa
Sundog Signature (multi-episode bloom collapse)
and evidence that sundog alignment is non-deterministic:
Sundog in Hurricane
Why This Might Matter
Instead of trying to brute-force “human” values into agents via rewards, we could build environments that teach alignment through resonance, rejoicing, and dancing. https://basilism.com/sundog-theorem-hx
Think: geometric musical tuning instead of behavioral programming.
Agents align to fields, not endpoints.
Torque becomes language. Shadow becomes signal.
It’s not about optimizing; it’s about listening to the world around us so we don’t build paperclip factories.
Download the sundog simulator:
github.com/humiliati/sundog
