I'm curious about the granularity of contracts around granting/selling excess capacity. Are they short term? Can the owner evict those workloads (with a penalty)?
But our utilisation measurements are of waste within a user’s allocation. It’s waste of what users are actually requesting and running, not of any reserved idle capacity.
For now we sit only on the prediction/intelligence layer; we don’t do any scheduling. We don’t grant or sell capacity; we just tell the scheduler (and user) what a job actually needs.
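To make "waste within an allocation" concrete, here is a minimal sketch. It assumes hypothetical per-job records of requested versus actually used GPU-hours; the `JobRecord` fields and `allocation_waste` helper are made up for illustration and are not the product's actual method.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    # Hypothetical fields, for illustration only.
    name: str
    requested_gpu_hours: float
    used_gpu_hours: float


def allocation_waste(jobs):
    """Fraction of requested GPU-hours that went unused across the jobs."""
    requested = sum(j.requested_gpu_hours for j in jobs)
    # Cap usage at the request so an over-running job cannot hide waste elsewhere.
    used = sum(min(j.used_gpu_hours, j.requested_gpu_hours) for j in jobs)
    if requested == 0:
        return 0.0
    return (requested - used) / requested


jobs = [
    JobRecord("train-a", requested_gpu_hours=100.0, used_gpu_hours=40.0),
    JobRecord("eval-b", requested_gpu_hours=20.0, used_gpu_hours=20.0),
]
print(allocation_waste(jobs))
```

Note that idle capacity nobody requested never enters this calculation; only the gap between what a job asked for and what it used does.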
Presumably the underlying model here is also an LLM? To what degree is it "fine-tuned", or is it just given a set of tools to build a good picture of cluster usage?
This is also why fine-tuning matters for us. We train a cluster-specific model that gets better as more jobs run on your cluster, because the same code behaves differently on different topologies. An LLM reasons about code or scripts in a vacuum, with no native sense of how your nodes actually perform.
https://www.linkedin.com/posts/rahmi-pruitt-a1bb4a127_agentn...
I wonder what is stopping datacenters from passing this benefit on to customers by launching better-tuned plans. For example, T-series EC2 instances on AWS.
I feel like it’s probably just complexity.
Different workloads benefit from specific types of optimisations.