Azure AI Inference Provider

Connect to Azure AI Inference and GitHub Models for serverless access to a wide catalog of models.

Quick Start

builder.Services
    .AddCoreAIServices()
    .AddCoreAIOrchestration()
    .AddCoreAIAzureAIInference();

Services Registered

Service	Implementation	Lifetime
`IAIClientProvider`	`AzureAIInferenceClientProvider`	Scoped
`IAICompletionClient`	`AzureAIInferenceCompletionClient`	Scoped
Connection source	—	Scoped

Configuration

For GitHub Models

{
  "CrestApps": {
    "AI": {
        "Connections": [
          {
            "Name": "github-models",
            "ClientName": "AzureAIInference",
            "Endpoint": "https://models.inference.ai.azure.com",
            "ApiKey": "your-github-token"
          }
        ],
        "Deployments": [
          {
            "Name": "gpt-4o-mini",
            "ClientName": "AzureAIInference",
            "ConnectionName": "github-models",
            "Type": "Chat"
          }
        ]
      }
    }
}

For Azure AI Studio

{
  "CrestApps": {
    "AI": {
        "Connections": [
          {
            "Name": "azure-ai-inference",
            "ClientName": "AzureAIInference",
            "Endpoint": "https://your-project.inference.ai.azure.com",
            "ApiKey": "your-api-key"
          }
        ]
      }
    }
}

Constants

Constant	Value
`AzureAIInferenceConstants.ClientName`	`"AzureAIInference"`

Use Cases

GitHub Models — Access models like GPT-4o, Llama, Mistral via GitHub token
Azure AI Studio — Serverless model deployments without managing infrastructure
Model Catalog — Access a wide variety of models through a single provider

Capabilities

Capability	Supported
Chat completions	✅
Streaming	✅
Embeddings	✅
Image generation	❌
Speech-to-text	❌
Text-to-speech	❌

GitHub Models

The Azure AI Inference provider is the gateway to the GitHub Models marketplace, which lets you experiment with and deploy models from multiple vendors through a single API:

Get a GitHub token — Generate a personal access token (PAT) at github.com/settings/tokens with the models:read scope.
Point to the GitHub Models endpoint — Use https://models.inference.ai.azure.com as the endpoint.
Select a model — Use the model name from GitHub Marketplace as the deployment name.

{
  "CrestApps": {
    "AI": {
        "Connections": [
          {
            "Name": "github-models",
            "ClientName": "AzureAIInference",
            "Endpoint": "https://models.inference.ai.azure.com",
            "ApiKey": "ghp_your-github-token"
          }
        ]
      }
    }
}

tip

GitHub Models is free for experimentation with rate limits. For production workloads, deploy the same models through Azure AI Studio for higher throughput and SLA guarantees.

Configuration

Programmatic Registration

builder.Services.AddCoreAIConnectionSource("AzureAIInference", options =>
{
    options.Connections.Add(new AIProviderConnectionEntry
    {
        Name = "github-models",
        ClientName = "AzureAIInference",
        // Endpoint and API key loaded from configuration
    });
});

Environment-Specific Configuration

Use different endpoints for development vs. production:

appsettings.Development.json
{
  "CrestApps": {
    "AI": {
        "Connections": [
          {
            "Name": "github-models",
            "ClientName": "AzureAIInference",
            "Endpoint": "https://models.inference.ai.azure.com",
            "ApiKey": "ghp_your-github-token"
          }
        ]
      }
    }
}

appsettings.Production.json
{
  "CrestApps": {
    "AI": {
        "Connections": [
          {
            "Name": "production-models",
            "ClientName": "AzureAIInference",
            "Endpoint": "https://your-project.inference.ai.azure.com",
            "ApiKey": "your-azure-key"
          }
        ]
      }
    }
}

Available Models

The Azure AI Inference / GitHub Models endpoint provides access to a broad catalog of models from multiple vendors:

Vendor	Example Models	Strengths
OpenAI	GPT-4o, GPT-4o-mini	General purpose, function calling, vision
Meta	Llama 3.2, Llama 3.1	Open-weight, strong reasoning
Mistral	Mistral Large, Mistral Small	Multilingual, efficient
Cohere	Command R+, Command R	RAG-optimized, multilingual
AI21	Jamba 1.5	Long context, document processing

info

Model availability varies between GitHub Models (free tier) and Azure AI Studio (production tier). Check the respective marketplaces for the current catalog.

Quick Start​

Services Registered​

Configuration​

For GitHub Models​

For Azure AI Studio​

Constants​

Use Cases​

Capabilities​

GitHub Models​

Configuration​

Programmatic Registration​

Environment-Specific Configuration​

Available Models​

Quick Start

Services Registered

Configuration

For GitHub Models

For Azure AI Studio

Constants

Use Cases

Capabilities

GitHub Models

Configuration

Programmatic Registration

Environment-Specific Configuration

Available Models