This video with a procurement manager from a large company is from a few months ago. At the time she was planning six weeks ahead; now she plans six months ahead, and the outcome still surprises her. Note that she is not buying semiconductors as such. She is planning when she can buy the GPUs her team needs to run a model that the company promised to deliver to a customer several months ago. She said the project had already slipped, and that hardware arriving even three weeks late would have mattered.
Slipping AI projects, and the client anger that follows, are becoming the norm because hardware does not arrive on time.
Who actually gets the hardware?
A major reason is that the biggest spenders on AI infrastructure are the hyperscalers. Their annual spend on AI infrastructure is in the tens of billions of dollars. That money goes directly to the manufacturers, who treat the hyperscalers as priority customers. In some cases, a single purchase agreement from a hyperscaler can account for an entire fiscal quarter of a manufacturer's sales.
So who actually gets the hardware they need to finish building specialized AI models for their customers? As I said before, the large hyperscalers can place orders of incomprehensible scale and will get priority allocation from the semiconductor manufacturers. The mid-market company with specialized AI applications can only hope to get the GPUs it needs to finish building a model for a customer on time. The researcher can only hope to get the allocation needed to complete the research.
The secondary market for GPUs (NVIDIA components) is chaotic right now, with wild price swings, and the unpredictable availability of new GPUs appears to be the main cause of those fluctuations. A purchasing strategy built around that market is therefore close to futile for a company trying to build out AI infrastructure. It is better to purchase NVIDIA components from a hardware distributor with a proven track record of delivering high-quality hardware at a low total cost of ownership. The biggest risk in buying GPUs on the secondary (or grey) market is an expensive failure, after which whoever made the purchase is left explaining to the CEO or to investors why the AI deployment built on those GPUs failed.
Unprecedented demand for AI hardware
Large language models, real-time recommendation systems, and generative tools embedded in large enterprise applications all require significant amounts of compute. As a result, the servers that companies have traditionally used for older workloads are no longer sufficient. Companies are trying to purchase as much hardware as possible to train their models and run them at scale.
In practice, companies are powering large language models, running high-volume inference and real-time recommendation systems, and embedding generative tools in enterprise software. To run these workloads, they are moving from modest server clusters to GPU clusters, without which the workloads cannot run at all. And they need those GPU clusters as quickly as possible, not in 6-12 months, when the market may have stabilized for the time being.
Once you have decided that your organization needs some form of AI-based processing and that hardware must be procured, the next item to consider is the basic procurement process. I say “basic” because organizations are increasingly realizing that the process used to buy ordinary IT gear does not work well for AI hardware. In many organizations, acquiring these systems has become a competition to see who can get the needed gear the fastest. The normal sequence of procurement (evaluate alternative solutions, select the best one, award to a vendor, implement) is not well suited to the current environment.
Why supply can’t just catch up
Semiconductor chips, particularly the leading-edge ones used in high-performance computing to power AI workloads, take multiple years to design, develop, test, and qualify before mass production can begin. They are produced in a very limited number of fabrication facilities (fabs) around the world, and semiconductor production cannot be ramped up quickly to meet surging demand.
It’s like discovering your city has a water shortage and deciding the solution is to build a new reservoir. Technically correct. Completely unhelpful in the next eighteen months.
A second point worth mentioning is that a GPU-based AI server (or node) is made up of many individual components, each with its own supply chain. The packaging, the high-bandwidth memory, the advanced cooling (such as liquid cooling), and the power delivery all have to come together with the GPU die produced at a leading-edge foundry. Each of these components also has its own constraints and limits. A shortage of high-quality packaging materials, for example, could delay production of certain types of GPU-based servers. Similarly, a shortage of high-bandwidth memory could delay production of other types of servers. As a result, a shortage of any one component can hold up the overall system (the server or node), even when every other component is available.
(There’s a very ironic connection here. The companies racing to develop AI tools could be using those very tools to optimize the supply chains that deliver the hardware needed to develop them. For now, though, that is not the case.)
What smarter procurement actually looks like
This post aims to be of some help with smarter procurement of NVIDIA components, although the author did not manage to build a framework that would make it more solid. What follows is therefore an overview of possible approaches to AI hardware purchasing, followed by notes on how those approaches tend to play out.
- Planning horizons have lengthened dramatically. Twelve to eighteen months of forward visibility is now a reasonable target, not an overreach.
- Vendor relationships matter more than ever. Companies with established distributor relationships are getting access that purely transactional buyers simply aren’t seeing.
- Flexibility in specs actually helps. Teams that can adapt to available configurations instead of waiting indefinitely for a specific SKU are moving faster and missing fewer windows.
- Budget buffers are non-negotiable. Prices at the component level are not stable. Planning as if they are is optimistic at best, catastrophic at worst.
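As a rough illustration of the first and last points, the arithmetic behind a planning horizon and a budget buffer can be sketched in a few lines of Python. This is a minimal sketch, not a planning tool: the function names, the example dates, and the 20% buffer are hypothetical values chosen for illustration, and only the 12 to 18 month horizon comes from the list above.

```python
from datetime import date


def order_by_date(need_by: date, lead_time_months: int) -> date:
    """Latest date to place an order, counting back whole months."""
    # Work in "months since year 0" so year boundaries are handled for us.
    total_months = need_by.year * 12 + (need_by.month - 1) - lead_time_months
    year, month_index = divmod(total_months, 12)
    # Clamp the day to 28 so the result is a valid date in every month.
    day = min(need_by.day, 28)
    return date(year, month_index + 1, day)


def buffered_budget(quoted_cost: float, buffer_pct: float) -> float:
    """Quoted cost plus a percentage buffer for component price swings."""
    return quoted_cost * (1 + buffer_pct / 100)


if __name__ == "__main__":
    # Hypothetical deployment date and quote, for illustration only.
    need_by = date(2026, 6, 15)
    print("Order by (12-month horizon):", order_by_date(need_by, 12))
    print("Order by (18-month horizon):", order_by_date(need_by, 18))
    print("Budget with 20% buffer:", buffered_budget(100_000, 20))
```

Counting back from the deployment date, rather than forward from today, makes it obvious when a project is already inside the lead-time window.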
A quick look at how different procurement approaches tend to shake out:
| Approach | Lead time expectation | Risk level | Typical outcome |
| --- | --- | --- | --- |
| Reactive, spot buying | Immediate need | High | Delays, inflated costs, grey market exposure |
| Planned allocation with distributor | 6-12 months out | Medium | More predictable delivery, better pricing |
| Long-term partner agreements | 12-18+ months | Lower | Priority access, cost stability, relationship leverage |
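One way to read the table is as a lookup: the further out the need, the more approaches are still open. The Python sketch below encodes that reading. The `Approach` class, the `feasible_approach` function, and the exact month thresholds (0, 6, and 12) are assumptions derived from the lead time column, not a rule stated by any vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approach:
    name: str
    min_months_ahead: int  # earliest lead time from the table's range
    risk: str


# Ordered from longest to shortest lead time, mirroring the table rows.
APPROACHES = [
    Approach("Long-term partner agreements", 12, "Lower"),
    Approach("Planned allocation with distributor", 6, "Medium"),
    Approach("Reactive, spot buying", 0, "High"),
]


def feasible_approach(months_until_needed: int) -> Approach:
    """Return the lowest-risk approach whose lead time still fits."""
    for approach in APPROACHES:
        if months_until_needed >= approach.min_months_ahead:
            return approach
    raise ValueError("months_until_needed must be non-negative")


if __name__ == "__main__":
    for months in (3, 8, 14):
        choice = feasible_approach(months)
        print(f"{months} months out: {choice.name} (risk: {choice.risk})")
```

The point of the sketch is the ordering: a team that starts three months out has only the high-risk row left, whatever its budget.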
The uncomfortable truth about timing
The supply problems facing AI hardware are not short term, and demand for AI compute infrastructure is currently rising at an exponential rate. The hardware that many current AI programs depend on is in the middle of a multi-year build-out, and in all likelihood that build-out will not be smooth or gradual. New hardware will be required as large language models are trained, and inference at scale, a growing use of AI compute, will need ever-increasing amounts of hardware as well. Supply is unlikely to grow any faster than the infrastructure roadmaps of today's high-performance computing vendors already anticipate.
Back to the procurement manager and her ‘invisible purchases’. She was explaining to me how she is trying to secure enough hardware for next year’s projects. As we spoke, she was in the middle of getting a supplier to agree on how many units she could purchase. She admitted it was a bit strange: she couldn’t see the items, and she had no clear idea which projects they would actually be used for. But that was precisely her situation, and she was doing just fine at keeping her team on its deployment schedule. Indeed, she said her team had not missed a deployment in the last two quarters. Yes, she had had her problems with the supply chain, but she had managed to stay on top of them.
Getting ahead of a problem is usually the best way to stop chasing it. This is obvious, yet I see many teams struggle with AI hardware procurement and end up scrambling at the end of the fiscal quarter, calling vendor after vendor to ask for allocation for a deployment at the start of the next one. There is a far greater opportunity for these teams in the current environment.
