
New AI inference models on Application Catalog: translation, agents, and flagship reasoning

  • December 7, 2025
  • 2 min read

We’ve expanded our AI inference Application Catalog with three new state-of-the-art models, covering massively multilingual translation, efficient agentic workflows, and high-end reasoning. All models are live today via Everywhere Inference and Everywhere AI, and are ready to deploy in just 3 clicks with zero infrastructure management.

We update our Application Catalog whenever promising new AI models launch so that Gcore AI customers can deploy them with ease. It’s part of our commitment to making AI simple to use, without compromising on performance.

Let’s see what these new models offer.

Meta SeamlessM4T v2 Large

Meta’s SeamlessM4T v2 Large is a massively multilingual, multimodal translation model designed to break down communication barriers across nearly 100 languages. It is a unified system, meaning it can handle complex, real-world translation needs without relying on separate models for speech recognition (ASR) and text-to-speech (TTS).

Key capabilities:

  • All modalities: Supports speech-to-speech, speech-to-text, text-to-speech, and text-to-text translation.
  • Global: Seamlessly translates speech and text in approximately 100 languages.
  • Ideal for: Real-time voice translation services, powering multilingual contact centers, and creating sophisticated global AI assistants that can interact naturally across different linguistic modes.

Deploy Meta SeamlessM4T v2 Large
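For reference, a request to a deployed SeamlessM4T endpoint might be assembled like the sketch below. The endpoint URL, payload shape, task names, and language codes here are illustrative assumptions, not the documented Gcore API; check your deployment’s endpoint details in the Customer Portal.

```python
import json

# Hypothetical endpoint for a SeamlessM4T v2 Large deployment (illustrative only).
ENDPOINT = "https://example-inference.gcore.example/v1/seamless-m4t-v2-large"

def build_t2tt_request(text: str, src_lang: str, tgt_lang: str) -> dict:
    """Build a text-to-text translation (T2TT) payload.

    SeamlessM4T uses three-letter language codes (e.g. 'eng', 'fra');
    the payload shape here is an assumption for illustration.
    """
    return {
        "task": "t2tt",  # the model also covers s2st, s2tt, and t2st tasks
        "text": text,
        "src_lang": src_lang,
        "tgt_lang": tgt_lang,
    }

payload = build_t2tt_request("Hello, world!", "eng", "fra")
print(json.dumps(payload))
# To send: POST this JSON body to ENDPOINT with your API key in the headers.
```

The same payload builder extends naturally to the speech tasks by swapping the `task` field and attaching audio instead of text.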

MiniMax M2

MiniMax M2 is a powerful addition for developers focused on agentic workflows and coding tasks. Built on a Mixture-of-Experts (MoE) architecture, the LLM has 230 billion total parameters but activates only about 10 billion per token. This selective activation makes it exceptionally fast and cost-efficient.

Key capabilities:

  • Near-flagship quality: Delivers strong performance without the high latency and cost typically associated with models of this size.
  • Agent-optimized: Highly tuned for complex agentic use cases, excelling at planning, tool use, and sophisticated multi-step workflows.
  • Ideal for: Building efficient tool-using agents, acting as a high-speed dev copilot, and automating multi-step workflow pipelines.

Deploy MiniMax M2
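If your MiniMax M2 deployment exposes an OpenAI-compatible chat completions API (a common convention for hosted LLMs, assumed here rather than confirmed), a tool-using agent request could be sketched as follows. The model name and the `search_docs` tool are hypothetical placeholders.

```python
def build_agent_request(user_msg: str) -> dict:
    """Build an OpenAI-compatible chat payload with one tool definition.

    The model name and tool are illustrative; adapt them to your deployment.
    """
    return {
        "model": "minimax-m2",  # assumed deployment name
        "messages": [{"role": "user", "content": user_msg}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "search_docs",  # hypothetical tool the agent may call
                "description": "Search internal documentation.",
                "parameters": {
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            },
        }],
    }

request = build_agent_request("Find the deploy guide for Everywhere Inference.")
print(request["tools"][0]["function"]["name"])
```

In an agent loop, you would POST this payload, inspect the response for `tool_calls`, execute the named tool, and append the result as a `tool` message before the next turn.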

Qwen3-235B-A22B-Instruct-2507

Qwen’s latest flagship model is built for the most sophisticated enterprise and research applications. This large MoE model (235 billion total parameters with 22 billion active) is instruction-tuned and multilingual, providing world-class performance across challenging benchmarks. Its standout feature is its massive context window, accommodating extremely long inputs.

Key capabilities:

  • High-end reasoning: Excels at challenging reasoning tasks, advanced mathematics, and sophisticated code generation.
  • Massive context: Supports context up to 262K tokens, allowing the model to manage and analyze huge documents, codebases, or extended conversation histories.
  • Ideal for: Creating sophisticated enterprise copilots for research and analysis, powering highly complex agents that require deep context memory, and advanced general-purpose text generation.

Deploy Qwen3-235B-A22B-Instruct-2507
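A 262K-token window is large but finite. The sketch below is a rough pre-flight check for whether a document needs chunking, using the common ~4-characters-per-token heuristic; this is an approximation only, so use the model’s real tokenizer when accurate counts matter.

```python
CONTEXT_LIMIT = 262_144   # ~262K-token context window
CHARS_PER_TOKEN = 4       # rough English-text heuristic, not the real tokenizer

def estimate_tokens(text: str) -> int:
    """Crude token estimate: ~4 characters per token."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_context(text: str, reserve_for_output: int = 4096) -> bool:
    """True if the prompt plus a reserved output budget fits the window."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_LIMIT

doc = "x" * 1_000_000  # ~1M characters, roughly 250K tokens
print(fits_in_context(doc))
```

Reserving an output budget up front avoids requests that fit the prompt but leave the model no room to answer.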

Deploy the latest models in 3 clicks and 10 seconds

With Gcore AI inference solutions, you can eliminate the operational complexity of AI without compromising performance or power. Experience low-latency routing, transparent and efficient costs, and 3-click deployment. Simply open the Gcore Customer Portal, choose your model from the Application Catalog, and launch your endpoint.
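Once your endpoint is launched, calling it is a single HTTP request. The sketch below assumes an OpenAI-compatible chat completions route; the URL, deployment name, and auth header are placeholders to replace with your actual endpoint details from the Customer Portal.

```python
import json
import urllib.request

# Placeholders: substitute the endpoint URL and key shown for your deployment.
ENDPOINT = "https://inference.example.gcore.dev/v1/chat/completions"
API_KEY = "YOUR_API_KEY"

body = json.dumps({
    "model": "qwen3-235b-a22b-instruct-2507",  # assumed deployment name
    "messages": [{"role": "user", "content": "Say hello."}],
}).encode()

req = urllib.request.Request(
    ENDPOINT,
    data=body,
    headers={"Authorization": f"Bearer {API_KEY}",
             "Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # uncomment to send once your endpoint is live
print(req.get_method())
```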

Deploy these new AI models today

