The global most powerful information hub of high performance & advanced materials, innovative technologies

to market your brand and access to the global demand and supply markets

Amazon launches AWS Inferentia chip for AI deployment

January 15, 2024

Amazon has announced Inferentia, a chip designed by AWS specifically for deploying large AI models that run on GPUs. The chip is due to launch next year.

Inferentia will work with major frameworks such as TensorFlow and PyTorch, and is compatible with EC2 instance types and Amazon's machine learning service SageMaker.

"You will be able to get hundreds of TOPS on each chip; if you want, you can bundle them together to get thousands of TOPS," AWS CEO Andy Jassy said at the annual re:Invent conference today.

Inferentia will also work with Elastic Inference, a way to accelerate the deployment of AI using GPU chips, which was also announced today.

Elastic Inference works with a range of 1 to 32 teraflops. It detects when a major framework is being used with an EC2 instance, then looks at which parts of the neural network would benefit most from acceleration and moves those parts to Elastic Inference to improve efficiency.

Jassy said that the two main processes required to launch an AI model today are training and inference, with inference accounting for nearly 90% of the cost.

"We think that operating costs can be cut by 75% through Elastic Inference. If you layer Inferentia on top of that, that is another 10-fold improvement in cost, so this is a major game changer, these two launches across inference for our customers," he said.

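As a rough illustration of how the two figures in Jassy's quote might combine, the sketch below assumes the 75% saving and the 10-fold improvement apply one after the other to the same inference bill. That reading is an assumption, not something AWS has stated, and the function name is hypothetical.

```python
def remaining_cost_fraction(savings_fraction: float, improvement_factor: float) -> float:
    """Fraction of the original inference cost left after both reductions.

    Assumption: the saving is applied first, and the improvement factor
    then divides whatever cost remains.
    """
    after_savings = 1.0 - savings_fraction
    return after_savings / improvement_factor


# 75% saving from Elastic Inference, then a further 10x from Inferentia
fraction = remaining_cost_fraction(0.75, 10.0)
print(f"Remaining share of original inference cost: {fraction:.1%}")
```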
Inferentia's release follows Monday's debut of another chip built by AWS, one designed to handle general-purpose workloads.

The debut of Inferentia and Elastic Inference was among several AI-related announcements made today. Also announced were an AWS marketplace where developers can sell their AI models, and the DeepRacer League and AWS DeepRacer car, which runs on AI models trained using reinforcement learning in a simulated environment.

A number of services that require no prior knowledge of how to build or train AI models were also made available in preview today, including Textract for extracting text from documents, Personalize for customer recommendations, and Amazon Forecast, a service that generates private forecasting models.
