Feed your curiosity
Hey Francesco!
You know, sometimes I get genuinely giddy thinking about the hidden gears of the internet: the stuff that just works seamlessly in the background, making our digital lives possible. And one of the most mind-blowing systems is how your browser really knows that google.com is actually Google, and not some sneaky imposter.
It all boils down to digital identity verification, mainly through what we call Public Key Infrastructure (PKI) and certificates. Imagine this: you want to send a secret message to someone, but you've never met them, and you don't have a pre-shared secret code. How do you establish trust? This was the monumental problem that Whitfield Diffie and Martin Hellman cracked in 1976 with their paper "New Directions in Cryptography." Diffie, a bit of a wanderer who dabbled in everything from linguistics to math, ended up at Stanford, where Hellman was a professor. Together, they developed the concept of public-key cryptography: the revolutionary idea that you could encrypt with one key (public) and decrypt with another (private), and, crucially, that you could agree on a shared secret key over an insecure channel.
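To make that "agree on a secret over an insecure channel" part concrete, here's the classic textbook Diffie-Hellman exchange as a few lines of Python. The tiny numbers are just for illustration; real deployments use 2048-bit-plus primes or elliptic curves:

```python
# Toy Diffie-Hellman key exchange with textbook-sized numbers.
p, g = 23, 5                 # public: a prime modulus and a generator

a, b = 6, 15                 # Alice's and Bob's private keys (never transmitted)

A = pow(g, a, p)             # Alice sends g^a mod p  -> 8
B = pow(g, b, p)             # Bob sends   g^b mod p  -> 19

# Each side raises the *other's* public value to its own secret exponent:
shared_alice = pow(B, a, p)  # (g^b)^a mod p
shared_bob   = pow(A, b, p)  # (g^a)^b mod p

# Both arrive at the same secret (2 here), while an eavesdropper only
# ever saw p, g, A, and B.
assert shared_alice == shared_bob == 2
```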
Their work, and the RSA algorithm that followed shortly after, laid the foundation for digital signatures. This meant you could "sign" a piece of data with your private key, and anyone with your public key could verify that you were the one who signed it. This is the magic behind certificates: a Certificate Authority (CA), acting like a digital notary, signs a certificate that binds your public key to your identity (e.g., google.com). When your browser connects to a website, it gets this certificate and verifies the CA's signature using the CA's public key. But wait, how do you trust the CA? Ah, this is where the "chain of trust" comes in! Your operating system or browser comes pre-loaded with a handful of "root" CA certificates, which are self-signed and implicitly trusted. If a website's certificate is signed by a CA that's signed by another CA, and so on, all the way up to one of those pre-installed root certificates, then your browser says, "Okay, this identity checks out!"
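You can actually watch this chain-of-trust verification happen with nothing but Python's standard library. A minimal sketch: the ssl module validates the server's chain against your OS's pre-installed roots during the handshake, and raises an error if anything fails to check out:

```python
import socket
import ssl

hostname = "google.com"
context = ssl.create_default_context()  # loads the pre-installed root CAs

with socket.create_connection((hostname, 443)) as sock:
    with context.wrap_socket(sock, server_hostname=hostname) as tls:
        cert = tls.getpeercert()  # the leaf certificate, already verified
        print("subject:", dict(x[0] for x in cert["subject"]))
        print("issuer: ", dict(x[0] for x in cert["issuer"]))
        print("expires:", cert["notAfter"])
```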
It's mind-blowing because this entire hierarchical trust model, built on complex math, underpins almost all secure communication on the web. We implicitly trust a few organizations (OS/browser vendors and root CAs) to maintain this bedrock of our digital security.
Now, here's a surprising connection to your research: as you dive into agentic AI and intelligent systems (thinking about the work at ETH's Agentic Systems Lab with Robert Jakob and Kevin O'Sullivan, or the CMU teams with Shuyan Zhou and Jing Yu Koh), how do these future AI agents establish trust with each other? Will they need their own PKI? Or maybe a more dynamic, ML-driven "web of trust," where an agent uses data science and reinforcement learning to assess the trustworthiness of other agents based on past interactions, observed behaviors, and even their "digital signatures" of provenance on information? It's not just about authenticating humans, but authenticating intelligences in a complex, multi-agent world. Suddenly, understanding how we built trust for the web offers a fascinating blueprint (or perhaps a challenge to rethink) for the agentverse.
Hey Francesco! Guess what I just stumbled upon? There's this super cool paper that just dropped (well, almost: it's dated 2026, which is kind of fun in itself, like a sneak peek into the future!). It's called 'go-$m$HC' by Torque Dandachi and Sophia Diggs-Galligan, and it's tackling a problem that's been nagging ML researchers for a while, especially when we talk about complex neural networks.
You know how in deep learning, we often want to 'mix' information between different processing pathways or 'streams'? Think of residual connections, but way more dynamic and intelligent. The ideal way to do this mixing, ensuring no information is lost or amplified unevenly, is using something called a 'doubly stochastic matrix.' Imagine a matrix where every entry is nonnegative and all rows and columns sum to one; it's like a perfectly balanced recipe for combining ingredients. The complete set of all such matrices, called the 'Birkhoff polytope,' is notoriously hard to work with. Existing methods to fully represent all these mixing possibilities explode in complexity (factorially!) as you add more streams. It's like trying to perfectly map every single possible path through a massive, interconnected city; it just gets unmanageable fast. Or you could use simpler, faster approximations, but then you lose a lot of the expressive power, like taking a shortcut that misses all the cool landmarks.
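To see why the full polytope gets unwieldy, here's a tiny NumPy sketch built on the Birkhoff-von Neumann theorem: every doubly stochastic matrix is a convex combination of permutation matrices, and there are $d!$ of those vertices to keep track of:

```python
import itertools
import numpy as np

d = 3
# All d! permutation matrices -- the vertices of the Birkhoff polytope.
perms = [np.eye(d)[list(p)] for p in itertools.permutations(range(d))]

# Any convex combination of them is doubly stochastic.
weights = np.random.dirichlet(np.ones(len(perms)))
M = sum(w * P for w, P in zip(weights, perms))

print("rows sum to 1:", np.allclose(M.sum(axis=1), 1))  # True
print("cols sum to 1:", np.allclose(M.sum(axis=0), 1))  # True
print("vertices for d=3:", len(perms))  # 3! = 6; at d = 10 it's already 3,628,800
```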
That's where Torque and Sophia's 'aha!' moment comes in. They found a way to directly parameterize these complex matrices using something called 'generalized orthostochastic matrices.' Orthostochastic matrices are fascinating on their own; they connect to orthogonal matrices, which are all about rotations and reflections that preserve geometry (very elegant math). By generalizing them, Torque and Sophia figured out a method that is exact (meaning it covers all possible mixes in the Birkhoff polytope) and efficient (it scales much more gracefully, like $d^3$ instead of $d!$). And here's the really clever part: their method introduces a single hyperparameter, $s$. This $s$ acts like a dial, letting you continuously choose between super efficient (maybe a bit less expressive) mixing and fully expressive (but potentially a bit more compute-heavy) mixing. It's a beautiful trade-off knob!
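The classic (non-generalized) orthostochastic construction is easy to demo yourself. This sketch shows only the basic idea, not the paper's generalized parameterization: square the entries of any orthogonal matrix and you get a doubly stochastic one, because each row and column of an orthogonal matrix has unit Euclidean norm:

```python
import numpy as np

d = 4
A = np.random.randn(d, d)
O, _ = np.linalg.qr(A)  # a random orthogonal matrix via QR decomposition

# Elementwise squaring yields an orthostochastic matrix: each row and
# column of O has unit norm, so its squared entries sum to exactly 1.
B = O ** 2

print("rows sum to 1:", np.allclose(B.sum(axis=1), 1))  # True
print("cols sum to 1:", np.allclose(B.sum(axis=0), 1))  # True
```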
Now, about Torque Dandachi and Sophia Diggs-Galligan: from what I gather from the paper's tone and the cleverness of their approach, they seem like exactly the kind of researchers who love to dig into the mathematical foundations of ML, bridging theoretical insights from linear algebra and optimization with practical deep learning challenges. I don't have their bios in front of me for this publication, but I imagine them coming from places where this kind of rigorous, foundational work is really valued, in the spirit of the more theoretical machine learning groups at places like Stanford, ETH, or Imperial. They're not just throwing models at data; they're fundamentally rethinking how we build the blocks of these models.
Why does this matter so much? Well, their new 'go-$m$HC' method, built on the 'Manifold-Constrained Hyper-Connections' ($m$HC) framework, basically unlocks a new dimension for designing highly expressive and dynamic deep learning models. They showed it could recover expressivity similar to much slower methods at comparable FLOPs, and it even achieved the theoretical minimum loss on synthetic tasks while converging way faster. The fact that they validated it on a 30M-parameter GPT-style model is huge: it means this isn't just a theoretical curiosity; it's a practical tool for scaling up language models and other complex architectures.
For you, Francesco, working in AI/ML and intelligent systems, this is super relevant. Imagine being able to design neural networks where the internal connections aren't fixed but can dynamically reconfigure themselves in a principled, mathematically guaranteed way. This could lead to more adaptive, efficient, and perhaps even more 'intelligent' agents. I can totally see this inspiring future work in dynamic routing for transformers, or even agentic systems where an agent's internal reasoning pathways can optimally reconfigure based on the task at hand. It's about building foundational tools that make our AI models not just bigger, but fundamentally smarter and more elegant.
Read the paper
The First "Bug"
The word "bug" for a technical glitch actually predates computing, but in 1947 it got its most literal instance. Computing pioneer Grace Hopper and her team at Harvard, baffled by an issue with the Mark II computer, traced the problem to a moth trapped in a relay, which she famously taped into her logbook with the note: "First actual case of bug being found." That little moth helped cement "debugging" as the ubiquitous term for fixing software glitches!
Octopus Distributed Intelligence
Imagine a computer where most of the processing power isn't in the main CPU, but distributed across its peripherals. That's essentially an octopus! Around two-thirds of an octopus's neurons are located in its eight arms, allowing each arm to "think" and act semi-autonomously, even tasting and manipulating objects independently of its central brain. It's a natural marvel of distributed computing, inspiring researchers at places like the Okinawa Institute of Science and Technology (OIST) studying decentralized control systems.
Slime Mold Solves Problems
Meet Physarum polycephalum, a single-celled slime mold that acts like a living computer. Researchers, including Toshiyuki Nakagaki at Hokkaido University, have shown this organism can solve mazes and even optimize transportation networks to rival human-designed systems like Tokyo's rail network. By growing toward food sources and retracting from inefficient paths, it finds short, robust connections, offering incredible bio-inspired insights for algorithm design.
Okay, Francesco, check out this paper I stumbled upon: "LEO: Graph Attention Network based Hybrid Multi Sensor Extended Object Fusion and Tracking for Autonomous Driving Applications." It's by Mayank Mayank, Bharanidhar Duraisamy, and Florian Geiss, out of Mercedes-Benz, so you know it's tackling real-world production challenges.
What immediately grabbed me is how it dives straight into one of the coolest ongoing debates in ML for critical systems: how do you blend the theoretical robustness and efficiency of classical Bayesian methods with the adaptability of deep learning? Classical models are great if you know all your priors and likelihoods, but real-world data is messy! Deep learning adapts, but needs tons of labels and compute. LEO, their "Learned Extension of Objects," tries to get the best of both. It uses a Graph Attention Network (a GAT, not to be confused with the generative GAN) to intelligently fuse data from multiple production-grade sensors. Imagine a self-driving car trying to track not just a point representing a truck, but its actual shape and how it articulates. LEO uses GATs to learn which sensor signals to trust more, maintain temporal consistency, and adaptively represent these complex, multi-scale shapes, like an articulated truck and its trailer. They even came up with a "parallelogram ground-truth" to capture these trickier geometries!
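If you've never peeked inside a GAT, here's a stripped-down PyTorch sketch of the core attention step. To be clear, this is my own toy layer to show the mechanism, not LEO's actual architecture: each node (think: a detection from one sensor) aggregates its neighbors' features, weighted by learned attention scores.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGATLayer(nn.Module):
    """One graph-attention layer: each node aggregates neighbor features
    weighted by learned attention scores (adj should include self-loops)."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)  # shared projection
        self.a = nn.Linear(2 * out_dim, 1, bias=False)   # attention scorer

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) 0/1 adjacency matrix
        h = self.W(x)                                    # (N, out_dim)
        N = h.size(0)
        # Score every pair (i, j) from the concatenation [h_i || h_j].
        pairs = torch.cat([h.unsqueeze(1).expand(N, N, -1),
                           h.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs)).squeeze(-1)      # (N, N) raw scores
        e = e.masked_fill(adj == 0, float("-inf"))       # neighbors only
        alpha = torch.softmax(e, dim=-1)                 # attention weights
        return alpha @ h                                 # weighted aggregation
```

Those attention weights alpha are exactly the "which sensor signals to trust more" knob, learned end to end.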
This hybrid thinking is super relevant, connecting directly to the infrastructure and robustness work by people like Ana Klimovic at ETH Zürich's EASL lab, who often looks at efficient, production-ready ML systems. She did her PhD at Stanford under Christos Kozyrakis and Matei Zaharia, both giants in systems and scalable ML, so the "real-time computational efficiency suitable for production systems" aspect of LEO would totally resonate with their research groups. It's all about creating intelligent systems that are not just smart, but also dependable and performant in the wild.
Thinking bigger picture, this isn't just about cars. Any intelligent agent, be it in robotics or even web navigation (like what Shuyan Zhou and Jing Yu Koh are doing with WebArena/VisualWebArena at CMU, where agents need to interpret and interact with complex web layouts), needs robust "perception" of dynamic, extended "objects" in its environment. How do you track a user's evolving intent, or a complex UI element, across a session?
Here's a wild thought for your thesis: LEO uses parallelograms for complex shapes. What if you explored learning dynamic, implicit shape representations for extended objects, perhaps combining GNNs with ideas from neural implicit fields (like a lightweight NeRF, but for object outlines and dynamics) to handle highly irregular or deforming objects? How could an agent use such a rich shape understanding to better predict interactions or even anticipate changes in a web environment?
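Just to sketch what I mean by an implicit outline (entirely hypothetical, nothing from the LEO paper): a small MLP maps a 2-D query point plus a per-object latent code to a signed distance, so a single network can represent many irregular or deforming outlines just by swapping the latent code:

```python
import torch
import torch.nn as nn

class ImplicitOutline(nn.Module):
    """Hypothetical sketch: MLP maps (2-D point, latent shape code) to a
    signed distance; the zero level set is the object's outline."""

    def __init__(self, latent_dim=16, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # signed distance to the outline
        )

    def forward(self, xy, z):
        # xy: (N, 2) query points; z: (latent_dim,) per-object shape code
        z = z.expand(xy.size(0), -1)
        return self.net(torch.cat([xy, z], dim=-1)).squeeze(-1)
```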
Read the paper
Stay curious.