Built a Mixture of Experts (MoE) neural network that learns addition and subtraction by routing problems to specialized experts.

---CODE ATTACHED BELOW---

How it works: A router network looks at each problem (e.g., "5 + 3" or "20 - 7") and decides which expert should handle it. Expert Add specializes in addition; Expert Sub specializes in subtraction. Over training, the router learns to send addition problems to Expert Add and subtraction problems to Expert Sub.

The cool part: all three networks train together, end to end. The router learns routing while each expert learns its operation, so a single loss improves both the routing decisions and the arithmetic at the same time. A minimal sketch of this joint setup is below.

Visualization: a real-time training dashboard with:
- Live network architecture visualization
- Training metrics (loss, accuracy, expert usage)
- Interactive testing after training
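To make the joint-training idea concrete, here is a minimal PyTorch sketch of the same architecture. The class names, layer sizes, and the op-flag encoding are my assumptions for illustration, not necessarily what the attached code does. The key points it demonstrates: the experts see only the operands, the router also sees the operation token, and a soft (softmax) gate keeps the whole thing differentiable so one backward pass trains the router and both experts together.

```python
# Minimal MoE sketch (assumed names/sizes), not the attached implementation.
import torch
import torch.nn as nn

class Expert(nn.Module):
    """Small MLP that maps the two operands (a, b) to one number."""
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x):
        return self.net(x)

class MoE(nn.Module):
    """Router produces soft weights over two experts; output is the weighted sum."""
    def __init__(self):
        super().__init__()
        self.experts = nn.ModuleList([Expert(), Expert()])  # [Expert Add, Expert Sub]
        # Router input: (a, b, op_flag); op_flag is 0 for "+" and 1 for "-".
        self.router = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 2))

    def forward(self, x):
        gates = torch.softmax(self.router(x), dim=-1)                   # (batch, 2)
        outs = torch.cat([e(x[:, :2]) for e in self.experts], dim=-1)   # (batch, 2)
        return (gates * outs).sum(dim=-1, keepdim=True), gates

def make_batch(n=256):
    """Synthetic problems: features [a, b, op], target a+b or a-b."""
    a = torch.randint(0, 50, (n, 1)).float()
    b = torch.randint(0, 50, (n, 1)).float()
    op = torch.randint(0, 2, (n, 1)).float()
    y = torch.where(op == 0, a + b, a - b)
    return torch.cat([a, b, op], dim=1), y

model = MoE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):
    x, y = make_batch()
    pred, gates = model(x)
    loss = nn.functional.mse_loss(pred, y)  # one loss trains router + experts jointly
    opt.zero_grad()
    loss.backward()
    opt.step()

# Quick check: "5 + 3" should land near 8, with gate mass on the add expert.
with torch.no_grad():
    x = torch.tensor([[5.0, 3.0, 0.0]])
    pred, gates = model(x)
    print(pred.item(), gates)
```

Two design choices worth noting in this sketch: the experts are deliberately not shown the op flag, which forces them to specialize rather than each learning both operations; and the gate is a soft softmax mixture rather than a hard argmax, so gradients flow back into the router and it can learn which expert to prefer.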