AI Model Provider
Integrating an AI model provider into your system helps your tenants improve the customer experience with faster, more accurate responses.
- 1 Configure Model Provider Integration in CloudCX
- 1.1 (1) OpenAI
- 1.2 (2) Google Gemini
- 1.3 (3) Ollama
- 1.4 (4) Azure AI Foundry
- 1.4.1 Azure AI Service
- 1.4.2 Azure OpenAI Service
- 1.5 (5) OpenAI Compatible
- 1.6 (6) Anthropic
- 2 Disconnect CloudCX from an AI Model Provider
- 3 Switching the embedding model in system settings
- 4 Reporting
Configure Model Provider Integration in CloudCX
Log in to your system administrator portal at https://your_cx_domain:9006 and go to Advance > Connector Hub Setting > Model Provider to configure the integration.
Six AI model providers are supported, and you can connect one or more of them: OpenAI, Google Gemini, Ollama, Azure AI Foundry, OpenAI Compatible, and Anthropic.
The parameters you need to fill in vary by provider; the connection is only established when they are entered correctly.
(1) OpenAI
API Key: Go to https://platform.openai.com/api-keys → Click API Keys → Click Create New Secret Key.
Save the key immediately—it’s only shown once!
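If you want to confirm the key works before entering it in CloudCX, you can run a quick optional check with the official openai Python SDK. This is a minimal sketch only; the model name is an example, and the SDK is not required by CloudCX itself.

```python
# Minimal sanity check of an OpenAI API key with the official SDK
# (pip install openai). The model name is only an example.
from openai import OpenAI

client = OpenAI(api_key="sk-...")  # the secret key you just created

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=5,
)
print(resp.choices[0].message.content)  # any reply means the key is valid
```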
(2) Google Gemini
API Key: Visit Google AI Studio → Click Get API key → Click Create API key.
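A similar optional key check is possible with the google-generativeai package; the model name here is only an example.

```python
# Optional check of a Gemini API key with the google-generativeai
# package (pip install google-generativeai).
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")  # example model name
print(model.generate_content("ping").text)
```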
(3) Ollama
Model Type: Choose LLM (Large Language Model) or Text Embedding (Text Embedding Model) based on your specific application scenario.
Model Name: The name of a locally installed model.
Base URL: The network address and port number for accessing the Ollama service. The format is: http://<IP address or domain name>:<port>.
Completion mode: Select Completion for general text generation and Chat for conversation scenarios.
Model context size: Refers to the upper limit of the total number of input and output tokens that the model can handle.
Upper bound for max tokens: Limits the maximum number of tokens the model can generate in a single response, which bounds the length of its output.
Function call support: Choose whether the model supports function calls. If it does, the application can provide the model with a set of functions and their usage instructions, and the model decides whether and how to call them based on the user query, enabling richer functionality and more accurate results.
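You can sanity-check the Base URL and Model Name values against a running Ollama service before saving. A minimal sketch, assuming Ollama's default port 11434 and an example model name:

```python
# Quick checks against a local Ollama service before filling in the form.
# Adjust base_url to your own host and port.
import requests

base_url = "http://localhost:11434"  # the value to enter as Base URL

# List locally installed models; these names go in the Model Name field.
for m in requests.get(f"{base_url}/api/tags").json()["models"]:
    print(m["name"])

# A chat-mode request; num_ctx corresponds to the model context size.
resp = requests.post(f"{base_url}/api/chat", json={
    "model": "llama3",  # example model name
    "messages": [{"role": "user", "content": "ping"}],
    "options": {"num_ctx": 4096},
    "stream": False,
})
print(resp.json()["message"]["content"])
```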
(4) Azure AI Foundry
Azure AI Service
Log in to Microsoft Azure → Create an AI Services resource.
Go to your resource → Keys and Endpoint → Copy the key and endpoint URL.
Completion mode: Select Completion for general text generation and Chat for conversation scenarios.
Model context size: Refers to the upper limit of the total number of input and output tokens that the model can handle.
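As an optional check of the copied key and endpoint, here is a hedged sketch using the azure-ai-inference package; the endpoint and model name are placeholders for your own resource.

```python
# Optional check of an Azure AI Services key and endpoint
# (pip install azure-ai-inference). Values below are placeholders.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://your-resource.services.ai.azure.com/models",
    credential=AzureKeyCredential("YOUR_AZURE_KEY"),  # from Keys and Endpoint
)
resp = client.complete(
    model="your-model-name",  # a model deployed to your resource
    messages=[UserMessage(content="ping")],
)
print(resp.choices[0].message.content)
```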
Azure OpenAI Service
Log in to Microsoft Azure → Create an Azure OpenAI Services resource.
Go to your resource → Keys and Endpoint → Copy the key and endpoint URL.
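The key and endpoint can likewise be verified with the openai SDK's AzureOpenAI client; the deployment name and API version below are placeholders to adjust for your resource.

```python
# Sketch of verifying an Azure OpenAI key and endpoint with the openai SDK.
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key="YOUR_AZURE_KEY",  # from Keys and Endpoint
    azure_endpoint="https://your-resource.openai.azure.com",
    api_version="2024-02-01",  # check your service documentation
)
resp = client.chat.completions.create(
    model="your-deployment-name",  # the deployment, not the base model name
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```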
(5) OpenAI Compatible
Model Type: Choose LLM (Large Language Model) or Text Embedding (Text Embedding Model) based on your specific application scenario.
Model Name: Enter the model name expected by your vendor's service.
OpenAI Compatible Endpoint: Provided by your service vendor (a custom URL, e.g., https://your-service.com/v1).
API Key: Generated similarly to OpenAI (check your provider’s docs).
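Because the endpoint follows the OpenAI API shape, you can exercise it with the openai SDK by overriding base_url. A minimal sketch with placeholder values:

```python
# Any OpenAI-compatible endpoint can be tested with the openai SDK
# by overriding base_url. All values below are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-service.com/v1",  # the compatible endpoint
    api_key="YOUR_VENDOR_KEY",
)
resp = client.chat.completions.create(
    model="your-model-name",  # the value entered in Model Name
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```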
(6) Anthropic
API Key: Sign in to Anthropic Console → Click API Keys → Click Create Key.
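An optional key check is possible with the anthropic package; the model name is only an example, so use one available to your account.

```python
# Optional check of an Anthropic API key (pip install anthropic).
import anthropic

client = anthropic.Anthropic(api_key="YOUR_ANTHROPIC_KEY")
msg = client.messages.create(
    model="claude-3-5-sonnet-latest",  # example model name
    max_tokens=16,
    messages=[{"role": "user", "content": "ping"}],
)
print(msg.content[0].text)
```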
If built-in models do not meet your needs, you can add a custom model.
For custom models with the same name but different model IDs, you can enable load balancing, which distributes requests across the multiple sets of credentials and reduces the pressure on any single one.
After connecting an AI model provider, you may configure the system model. The relevant AI functions only work once a model has been selected in the system model setting.
You can set the model Temperature and Top P.
Temperature: This parameter controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses.
Top P: Also known as "nucleus sampling", this parameter restricts sampling to the smallest set of most likely words whose cumulative probability reaches the threshold, cutting off the less probable ones. A low Top P makes the generated content more conservative and focused on high-probability words, suitable for professional scenarios; a high Top P makes it more diverse but prone to logical deviations, suitable for creative scenarios.
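To make the effect concrete, here is a hedged sketch of how these parameters are typically passed in a chat request; the openai SDK and model name are used purely as an illustration, since CloudCX sets these values for you through the UI.

```python
# How Temperature and Top P are typically passed to a chat model.
from openai import OpenAI

client = OpenAI(api_key="sk-...")

# Low temperature / low top_p: focused, repeatable answers.
conservative = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Name a color."}],
    temperature=0.2,
    top_p=0.3,
)

# High temperature / high top_p: more varied, creative answers.
creative = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Name a color."}],
    temperature=1.2,
    top_p=0.95,
)
print(conservative.choices[0].message.content)
print(creative.choices[0].message.content)
```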
Token Usage
Input tokens are the total tokens in the data sent to the AI model provider for analysis via API calls. This type of token is used most heavily, covering, for example, questions asked by customers and summaries of retrieved data.
Output tokens are the total tokens in the text generated by the AI model provider, for example the AI agent's automatic replies and content expanded or summarized by the AI copilot.
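Input and output token counts are reported by the provider with each API response. A sketch of where they appear, using the openai SDK as an illustration:

```python
# Where input and output token counts come from in a typical API response.
from openai import OpenAI

client = OpenAI(api_key="sk-...")
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize our refund policy."}],
)
print("input tokens: ", resp.usage.prompt_tokens)      # what was sent
print("output tokens:", resp.usage.completion_tokens)  # what was generated
print("total tokens: ", resp.usage.total_tokens)
```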
Disconnect CloudCX from an AI Model Provider
You can disconnect from the AI model provider with a single click.
Switching the embedding model in system settings
After you change the embedding model, documents, external URLs, and other content uploaded by tenants to the dataset must be resynchronized before they can be used normally, because vectors produced by different embedding models are not comparable.
Reporting
You can view AI input and output token usage by navigating to Reporting > Model Provider Token Usage.
Separate insights: Token usage for chat models and embedding models can be viewed separately.
Filtering capabilities: The view supports filtering by tenant or model, allowing granular analysis.