Announcing the Arcee Model Engine Public Beta
Get direct access to the small language models (SLMs) that power Arcee Orchestra, our new end-to-end, SLM-powered agentic AI platform. Sign up for the public beta of the Arcee Model Engine today.

We're thrilled to announce that the Arcee Model Engine, our hosted inference service, is now available in public beta at models.arcee.ai. Designed to support our agentic orchestration system, Arcee Orchestra, the Model Engine gives you direct access to Arcee's suite of small language models (SLMs), featuring some of the most capable models we've built to date.
With the Arcee Model Engine, you can integrate these models directly into your workflows or use them as standalone systems. Whether you're automating processes, driving complex reasoning, or building intelligent applications, the Model Engine is here to provide the performance and flexibility you need.
Models Available on the Arcee Model Engine
The Arcee Model Engine hosts the following SLMs, each tuned for specific capabilities:
Virtuoso (Large, Medium, Small)
General-purpose, high-performance models:
- Virtuoso Large: Best-in-class frontier model.
- Virtuoso Medium: Mid-tier general-purpose performance at a lower cost. (Coming Soon)
- Virtuoso Small: Optimized for lightweight tasks and faster inference. (Coming Soon)
We're also releasing this model under the Apache-2.0 license on Hugging Face.
Caller
A cutting-edge 32B function-calling model designed for executing workflows and interacting with external systems.
Coder
Purpose-built models for developers:
- Coder: Handles advanced programming and development tasks.
- Coder Small: Lightweight option for faster, simpler coding workflows and autocomplete tasks. (Coming Soon)
Spotlight
A wickedly fast vision-language model optimized for interpreting and generating multimodal outputs.
Maestro
Our most advanced reasoning model, ideal for tackling logical, mathematical, and analytical challenges. Excels at structured reasoning and step-by-step problem-solving. (Coming Soon)

Model Costs
The Arcee Model Engine offers flexible, cost-efficient pricing for all models, with a $20/month minimum. The rates per million tokens are as follows (a quick cost example follows the table):
| Model | Input Cost ($ per 1M tokens) | Output Cost ($ per 1M tokens) |
|---|---|---|
| Virtuoso Large | $1.27 | $1.50 |
| Virtuoso Medium | $0.67 | $0.82 |
| Virtuoso Small | $0.40 | $0.52 |
| Coder | $0.67 | $0.82 |
| Coder Small | $0.40 | $0.52 |
| Caller | $0.67 | $0.82 |
| Maestro | $1.59 | $1.88 |
| Spotlight | $0.29 | $0.40 |
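To make the rates concrete, here's an illustrative cost calculation for Virtuoso Large (the token counts are hypothetical, purely to show the arithmetic):
# Illustrative cost estimate at the Virtuoso Large rates above
input_tokens = 2_000_000    # 2M hypothetical input tokens
output_tokens = 500_000     # 0.5M hypothetical output tokens

input_rate = 1.27   # $ per 1M input tokens
output_rate = 1.50  # $ per 1M output tokens

cost = (input_tokens / 1_000_000) * input_rate + (output_tokens / 1_000_000) * output_rate
print(f"Estimated cost: ${cost:.2f}")  # Estimated cost: $3.29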
How to Get Started
Installing and using Arcee models is straightforward. All models are compatible with OpenAI's inference format, so integrating them into your applications is as simple as pointing the OpenAI client at the Arcee endpoint with your API key:
from openai import OpenAI

# Point the OpenAI client at the Arcee Model Engine endpoint
client = OpenAI(
    base_url="https://models.apps.arcee.ai/v1",
    api_key="YOUR_ARCEE_API_KEY",
)

# Stream a chat completion from Virtuoso Large
completion = client.chat.completions.create(
    model="virtuoso-large",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    stream=True
)

for chunk in completion:
    print(chunk.choices[0].delta)
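In practice you'll usually want the assembled reply rather than the raw delta objects. Here's a minimal variant of the loop above that concatenates the streamed text (note that delta.content can be None on some chunks, such as the final one):
# Assemble the streamed deltas into a single string
reply = ""
for chunk in completion:
    delta = chunk.choices[0].delta
    if delta.content:  # content can be None on some chunks
        reply += delta.content
print(reply)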
For function calling, here's an example:
from openai import OpenAI

# Point the OpenAI client at the Arcee Model Engine endpoint
client = OpenAI(
    base_url="https://models.apps.arcee.ai/v1",
    api_key="YOUR_ARCEE_API_KEY",
)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        }
    }
]

messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]

completion = client.chat.completions.create(
    model="caller",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)
print(completion)
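Caller responds with the tool call it wants to make rather than a final answer. Here's a minimal sketch of the follow-up round trip, assuming a hypothetical local get_current_weather implementation and that the endpoint accepts OpenAI-style tool-result messages:
import json

# Hypothetical local implementation of the tool Caller requested
def get_current_weather(location, unit="fahrenheit"):
    return json.dumps({"location": location, "temperature": "72", "unit": unit})

# Read the tool call from the model's response and run the function locally
tool_call = completion.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
result = get_current_weather(**args)

# Send the tool result back so the model can produce a final answer
messages.append(completion.choices[0].message)
messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})

followup = client.chat.completions.create(model="caller", messages=messages, tools=tools)
print(followup.choices[0].message.content)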
For vision-language tasks, try this:
from openai import OpenAI

# Point the OpenAI client at the Arcee Model Engine endpoint
client = OpenAI(
    base_url="https://models.apps.arcee.ai/v1",
    api_key="YOUR_ARCEE_API_KEY",
)
response = client.chat.completions.create(
    model="spotlight",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                    }
                },
            ],
        }
    ],
    max_tokens=300,
)
print(response.choices[0])
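If your image isn't hosted at a URL, the OpenAI format also allows sending it inline as a base64 data URL. Here's a minimal sketch of that approach with Spotlight (the file path is just an example):
import base64

# Read a local image (example path) and embed it as a base64 data URL
with open("boardwalk.jpg", "rb") as f:
    b64_image = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="spotlight",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64_image}"}},
            ],
        }
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)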
Get Started Today With Arcee AI's Model Engine
The Arcee Model Engine is a powerful, scalable inference solution. Whether you're creating advanced automations with Arcee Orchestra or integrating standalone SLMs into your projects, the flexibility and performance of our hosted models ensure you have everything you need to succeed.
Visit models.arcee.ai and start building today.