Generative AI improvements are increasingly being made through data curation and collection, not architectural improvements. Big Tech has an advantage.
… and that push has been obvious since before GPT-4 blew up, thanks to Google themselves. AlphaGo was quickly surpassed by AlphaGo Zero, which was surpassed by AlphaZero, which was surpassed by MuZero. Each one was an order of magnitude smaller than the last. Each one did more, sooner, despite less input.
A big part of this AI boom has been randos tooling around on a single consumer GPU. Outside of that, I understand there’s ways to rent compute time remotely, down to mundane individual budgets.
Meanwhile: big iron tells people to put glue on their pizza, based on exactly one Reddit comment. Money is not a cure-all that we’d merely like alternatives to. Money just amplifies whatever approach they’ve fixated on. It’s a depth-first search, opposite the breadth-first clusterfuck of everyone else doing their own thing.
I would bet good money on locality becoming a huge focus, once someone less depressed than me bothers to try it properly. Video especially doesn’t need every damn pixel shoved through the network in parallel. All these generators with hard limits on resolution or scene length could probably work with a fisheye view of one spot at a time. (They would have solved the six-finger problem much sooner, even if it took longer to ensure only two hands.) Even if that approach is not as good conceptually, it’s a lot narrower, so you could train the bejeezus out of it. We would not need another decade to find out if I’m just plain wrong.
Hey, bud. DM me if you need someone to talk to.