21 Sep 2023

by Meryem Arik

The concern around GPU shortages and how these could impact the AI revolution

Guest blog from Meryem Arik, CEO and Co-Founder at TitanML. Part of techUK's #SuperchargeUKTech Week 2023.

The AI revolution gained significant momentum with OpenAI's release of ChatGPT in November of last year. While it's evident that AI has the potential to profoundly transform various aspects of our lives, a significant obstacle currently hampers its progress - the availability of computational resources, particularly cutting-edge GPUs.

So, what are GPUs, and why are they crucial?

AI fundamentally involves solving complex mathematical problems, often on an enormous scale. Just as a calculator is necessary to solve mathematical problems, AI relies on powerful computational resources, commonly known as compute. Without sufficient compute, AI cannot thrive.

While various types of computational resources can be used for AI, GPUs (Graphics Processing Units) dominate for tasks requiring substantial computing power. Businesses running intensive AI models, such as language models, or those with low-latency requirements, need GPU-based inference.

How Is GPU Demand Evolving?

Pre-training Large Language Models (LLMs):

The training of Large Language Models (LLMs) is renowned for its intensive compute demands. For instance, training GPT-3, the foundational model behind ChatGPT, consumed an estimated 1,287 megawatt hours of electricity, roughly equivalent to the annual consumption of 120 US homes.
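
The comparison above can be sanity-checked with simple arithmetic. The sketch below assumes the 1,287 figure is in megawatt hours (MWh) and that an average US home uses roughly 10.7 MWh of electricity per year; that per-home value is an approximation introduced here for illustration, not a number from this article.

```python
# Sanity check: does 1,287 MWh match the yearly electricity use of ~120 US homes?

TRAINING_ENERGY_MWH = 1_287  # estimate quoted in the article, taken as MWh
US_HOMES = 120               # number of homes quoted in the article

# Assumption (not from the article): an average US home uses roughly
# 10.7 MWh of electricity per year.
ASSUMED_MWH_PER_HOME_PER_YEAR = 10.7


def implied_mwh_per_home(total_mwh: float, homes: int) -> float:
    """Yearly energy per home implied by spreading total_mwh over `homes`."""
    return total_mwh / homes


def homes_equivalent(total_mwh: float, mwh_per_home: float) -> float:
    """Number of homes whose yearly use adds up to total_mwh."""
    return total_mwh / mwh_per_home


if __name__ == "__main__":
    per_home = implied_mwh_per_home(TRAINING_ENERGY_MWH, US_HOMES)
    homes = homes_equivalent(TRAINING_ENERGY_MWH, ASSUMED_MWH_PER_HOME_PER_YEAR)
    print(f"Implied use per home: {per_home:.1f} MWh/year")
    print(f"Homes equivalent at the assumed average: {homes:.0f}")
```

The implied per-home figure lands close to the assumed average, which is what makes the 120-home comparison plausible at the megawatt-hour scale.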

This demanding task relies on extensive GPU clusters, and discussions on GPU demand frequently centre on this aspect. However, this high GPU demand primarily pertains to the training phase, which occurs only sporadically in a few companies. Once trained, LLMs can be utilised across myriad applications, effectively distributing the training cost among numerous users. Therefore, the per-business GPU requirement becomes a relatively small fraction.
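
The amortisation argument can be sketched in a few lines. All numbers below are hypothetical and chosen only to show the shape of the effect: a one-off training cost is divided across the businesses that reuse the model, while each business still pays for its own inference.

```python
def gpu_hours_per_business(
    training_gpu_hours: float,
    inference_gpu_hours_each: float,
    n_businesses: int,
) -> tuple[float, float]:
    """Return (training share, total) GPU-hours attributable to one business.

    The training run happens once, so its cost is split evenly across all
    businesses that reuse the model. Inference is paid by each business.
    """
    if n_businesses < 1:
        raise ValueError("n_businesses must be at least 1")
    training_share = training_gpu_hours / n_businesses
    return training_share, training_share + inference_gpu_hours_each


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    TRAINING_GPU_HOURS = 1_000_000
    INFERENCE_GPU_HOURS_EACH = 500

    for n in (1, 100, 10_000):
        share, total = gpu_hours_per_business(
            TRAINING_GPU_HOURS, INFERENCE_GPU_HOURS_EACH, n
        )
        print(f"{n:>6} businesses: training share {share:>12,.0f}, total {total:>12,.0f}")
```

As the number of businesses grows, the training share shrinks towards zero and each business's own inference becomes the dominant term.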

Commercial Fine-tuning and Inferencing LLMs:

The most significant growth in GPU demand is occurring in the commercial fine-tuning and inferencing of LLMs. With the advent of highly capable AI and mature LLMs, businesses across the spectrum are eager to integrate AI applications. This trend is evident in the rapid proliferation of OpenAI-compatible solutions following the release of ChatGPT.

In the envisioned future, our interaction with LLMs will become ubiquitous, ranging from predictive text to auto-transcription. Meeting this level of adoption will demand immense compute capacity.

This surging demand is already straining resources. OpenAI's premium version of ChatGPT, which guarantees consistent uptime, has experienced intermittent unavailability due to overwhelming demand and, presumably, insufficient compute resources. If this is occurring at this early stage of the AI evolution, one can only imagine the challenges in the months and years ahead as usage continues to soar.

What will be the impact?

The exponential growth in GPU demand is far outpacing supply, leading to widespread GPU shortages. This presents two major issues:

Exclusivity of AI: Insufficient supply often leads to substantial price hikes, restricting AI adoption to high-value use cases where benefits significantly outweigh costs. While this isn't inherently negative, it can stifle innovation. Furthermore, it concentrates AI's benefits in the hands of the wealthiest corporations, exacerbating the power imbalance in the AI landscape.
Reduced Efficiency: The consequences of this shortage are already visible, with models and requests exceeding the hardware capacity allocated to services, resulting in slower performance and increased costs. These inefficiencies have a cascading effect on AI applications, making them prone to glitches and slowdowns.

Neither of these outcomes aligns with the desired future of AI.

What can be done?

Fortunately, numerous strategies can mitigate our reliance on costly GPUs:

Select Appropriate Models: While powerful AI models like GPT-4 have their place, many use cases can achieve comparable or superior performance with smaller, resource-efficient models fine-tuned on high-quality data.
Model Compression and Hardware Optimization: Although these techniques are often confined to research labs, TitanML, through its Takeoff Inference Server, is democratising AI and machine learning deployment. This server enables companies to use more affordable GPUs, with some clients reporting over 90% reductions in compute costs and 2000% latency improvements within hours of deployment. TitanML has also achieved real-time deployment of the state-of-the-art Falcon LLM on commodity CPUs, a feat recognised by the industry, offering customers an even wider range of solutions.
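
To give a rough sense of why compression opens the door to cheaper hardware, the sketch below estimates the memory needed just to hold a model's weights at different numeric precisions. Lower-precision storage (quantisation) is one common compression technique, named here for illustration; it is not a description of how Takeoff works internally, and the 7-billion-parameter size is an assumed example. Real deployments also need memory for activations and caches, which this ignores.

```python
# Approximate bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(n_params: float, precision: str) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    if precision not in BYTES_PER_PARAM:
        raise ValueError(f"unknown precision: {precision}")
    return n_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    N_PARAMS = 7e9  # assumed example size, not a figure from the article
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(N_PARAMS, precision)
        print(f"{precision}: ~{gb:.1f} GB of weights")
```

Halving the bytes per parameter halves the weight footprint, which is why a model that needs a high-memory GPU at full precision may fit on a smaller, more affordable device once compressed.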

Conclusion

Over-reliance on scarce GPUs remains a pressing issue, and it may worsen before showing signs of improvement. Nonetheless, a wealth of best practices can reduce compute consumption when deploying AI, improving latency and reducing costs. Addressing this challenge is pivotal to realising the full potential of the AI revolution, and it's a mission we are committed to at TitanML.

For more details about TitanML, please visit: titanml.co

