Comparison and ranking of API provider performance for over 100 LLM endpoints across key performance metrics including price, output speed, latency, context window & others. For more details, including on our methodology, see our FAQs.

API providers compared: OpenAI, Playground AI, Mistral, Microsoft Azure, Ideogram, Amazon Bedrock, Hyperbolic, Groq, Together.ai, Anthropic, Perplexity, Google, Fireworks, Cerebras, Cohere, Lepton AI, Speechmatics, Deepinfra, Replicate, Runpod, Rev AI, fal.ai, AssemblyAI, DeepSeek, Reka AI, Deepgram, Gladia, Stability.ai, Baseten, Midjourney, Databricks, ElevenLabs, IBM, SambaNova, OctoAI, Cartesia, LMNT, 01.AI, and AI21 Labs.

Columns: Model, Context window, Artificial Analysis Quality Index (where available), Price (blended, USD per 1M tokens), Output speed (tokens/s), Latency (seconds, time to first token). Each row is one API provider's endpoint for the listed model.
o1-preview 128k $26.25 32.0 32.52
o1-mini 128k $5.25 71.1 14.58
GPT-4o (Aug '24) 128k 100 $4.38 105.5 0.39
GPT-4o (May '24) 128k 100 $7.50 109.2 0.37
GPT-4o (May '24) 128k 100 $7.50 112.3 0.37
GPT-4o mini 128k 88 $0.26 134.5 0.39
Llama 3.1 405B 128k 100 $9.50 18.2 0.24
Llama 3.1 405B 128k 100 $4.00 15.1 0.87
Llama 3.1 405B 128k 100 $7.99 13.4 1.79
Llama 3.1 405B 128k 100 $4.50 30.6 0.42
Llama 3.1 405B 128k 100 $2.80 18.4 1.06
Llama 3.1 405B 128k 100 $8.00 15.1 0.61
Llama 3.1 405B 128k 100 $3.00 72.1 0.63
Llama 3.1 405B 33k 100 $1.79 22.1 0.40
Llama 3.1 405B 8k 100 $6.25 129.2 1.42
Llama 3.1 405B 128k 100 $7.50 27.5 0.66
Llama 3.1 405B Turbo 8k 100 $5.00 91.1 0.57
Llama 3.1 70B 8k 95 $0.60 566.5 0.23
Llama 3.1 70B 128k 95 $0.40 28.7 0.68
Llama 3.1 70B 128k 95 $0.99 31.7 0.70
Llama 3.1 70B 128k 95 $0.90 58.2 0.27
Llama 3.1 70B 128k 95 $0.80 58.5 0.58
Llama 3.1 70B 128k 95 $2.90 27.5 0.58
Llama 3.1 70B 128k 95 $0.90 63.0 0.45
Llama 3.1 70B 128k 95 $0.36 27.9 0.29
Llama 3.1 70B 128k 95 $0.64 249.5 0.44
Llama 3.1 70B 8k 95 $0.75 407.9 0.80
Llama 3.1 70B 128k 95 $1.50 47.0 0.58
Llama 3.1 70B 128k 95 $1.00 51.2 0.24
Llama 3.1 70B Turbo 128k 95 $0.88 77.5 0.46
Llama 3.1 8B 8k 66 $0.10 2,009.0 0.26
Llama 3.1 8B 128k 66 $0.10 91.5 0.52
Llama 3.1 8B 128k 66 $0.22 88.1 0.40
Llama 3.1 8B 128k 66 $0.15 171.0 0.22
Llama 3.1 8B 128k 66 $0.07 208.9 0.36
Llama 3.1 8B 128k 66 $0.38 68.1 0.42
Llama 3.1 8B 128k 66 $0.20 275.1 0.27
Llama 3.1 8B 128k 66 $0.06 80.1 0.20
Llama 3.1 8B 128k 66 $0.06 751.0 0.37
Llama 3.1 8B 8k 66 $0.13 990.3 0.38
Llama 3.1 8B 128k 66 $0.20 160.9 0.17
Llama 3.1 8B Turbo 128k 66 $0.18 191.2 0.36
Gemini 1.5 Pro (Vertex) 2m 95 $5.25 63.7 0.53
Gemini 1.5 Pro (AI Studio) 2m 95 $5.25 65.6 0.84
Gemini 1.5 Flash (Vertex) 1m 84 $0.13 312.9 0.29
Gemini 1.5 Flash (AI Studio) 1m 84 $0.13 311.2 0.37
Gemma 2 27B 8k 78 $0.80 66.1 0.36
Gemma 2 9B 8k 71 $0.06 70.7 0.27
Gemma 2 9B 8k 71 $0.20 671.9 0.18
Gemma 2 9B 8k 71 $0.30 112.4 0.39
Claude 3.5 Sonnet 200k 98 $6.00 51.8 0.95
Claude 3.5 Sonnet 200k 98 $6.00 91.3 0.93
Claude 3 Opus 200k 93 $30.00 23.0 1.73
Claude 3 Opus 200k 93 $30.00 27.5 1.81
Claude 3 Haiku 200k 74 $0.50 119.1 0.45
Claude 3 Haiku 200k 74 $0.50 143.5 0.52
Mistral Large 2 128k 91 $3.00 37.0 0.59
Mistral Large 2 128k 91 $4.50 42.7 0.42
Mixtral 8x22B 65k 71 $3.00 64.9 0.51
Mixtral 8x22B 65k 71 $1.20 37.4 0.31
Mixtral 8x22B 65k 71 $1.20 61.5 0.34
Mixtral 8x22B 65k 71 $0.65 44.2 0.25
Mixtral 8x22B 65k 71 $1.20 71.9 0.38
Mistral Small (Sep '24) 128k $0.30 80.7 0.44
Mistral NeMo 128k 64 $0.15 135.5 0.41
Mistral NeMo 128k 64 $0.20 157.5 0.20
Mistral NeMo 128k 64 $0.13 63.7 0.20
Mixtral 8x7B 33k 61 $0.70 85.7 0.42
Mixtral 8x7B 33k 61 $0.47 87.6 0.23
Mixtral 8x7B 33k 61 $0.51 65.1 0.36
Mixtral 8x7B 33k 61 $0.45 81.9 0.28
Mixtral 8x7B 33k 61 $0.50 82.7 0.61
Mixtral 8x7B 33k 61 $0.50 100.9 0.32
Mixtral 8x7B 33k 61 $0.24 41.1 0.27
Mixtral 8x7B 33k 61 $0.24 541.8 0.22
Mixtral 8x7B 33k 61 $0.63 86.2 0.49
Mixtral 8x7B 33k 61 $0.60 104.7 0.39
Codestral-Mamba 256k $0.25 94.8 0.57
Pixtral 12B 128k $0.15 80.0 0.59
Command-R+ (Aug '24) 128k $6.00 45.9 0.56
Command-R+ (Aug '24) 128k $4.38 68.3 0.28
Command-R (Aug '24) 128k $0.75 102.5 0.39
Command-R (Aug '24) 128k $0.26 113.0 0.21
Command-R+ (Apr '24) 128k 75 $6.00 45.6 0.57
Command-R+ (Apr '24) 128k 75 $6.00 66.5 0.30
Command-R+ (Apr '24) 128k 75 $6.00 45.9 0.68
Command-R (Mar '24) 128k 63 $0.75 102.5 0.39
Command-R (Mar '24) 128k 63 $0.75 152.2 0.22
Command-R (Mar '24) 128k 63 $0.75 103.6 0.50
Sonar Large 33k $1.00 45.5 0.24
Sonar Small 33k $0.20 143.0 0.20
Sonar 3.1 Small 131k $0.20 137.6 0.18
Sonar 3.1 Large 131k $1.00 58.7 0.21
Phi-3 Medium 14B 128k $0.75 50.6 0.46
Phi-3 Medium 14B 4k $0.14 74.7 0.20
DBRX 33k 62 $1.13 81.8 0.49
DBRX 33k 62 $1.20 104.6 0.34
Reka Core 128k 90 $4.00 14.5 1.13

Key definitions

Artificial Analysis Quality Index: Average result across our evaluations covering different dimensions of model intelligence. Currently includes MMLU, GPQA, Math & HumanEval. OpenAI o1 model figures are preliminary and are based on figures stated by OpenAI. See methodology for more details.
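As a minimal sketch of the averaging described above, assuming each evaluation is already normalised to a 0-100 score (the actual normalisation and any weighting may differ), the index for a model could be computed as:

```python
# Hypothetical per-evaluation scores on a 0-100 scale (illustrative values only).
eval_scores = {"MMLU": 86.0, "GPQA": 52.0, "Math": 75.0, "HumanEval": 90.0}

# Quality Index as a simple unweighted average across the evaluations.
quality_index = sum(eval_scores.values()) / len(eval_scores)
print(f"Quality Index: {quality_index:.1f}")  # -> Quality Index: 75.8
```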

Context window: Maximum number of combined input & output tokens. Output tokens commonly have a significantly lower limit (varies by model).

Output Speed: Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API, for models which support streaming).

Latency: Time to first token received, in seconds, after the API request is sent. For models which do not support streaming, this represents the time to receive the full completion.
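As an illustration of how these two streaming metrics relate, the sketch below times a streaming request against an OpenAI-compatible chat completions endpoint; the client setup, model name, and the use of received chunks as a token proxy are assumptions for illustration, not the exact measurement harness behind this data.

```python
import time
from openai import OpenAI  # assumes the official OpenAI Python SDK and an OPENAI_API_KEY

client = OpenAI()

request_sent = time.monotonic()
stream = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{"role": "user", "content": "Write a short paragraph about rivers."}],
    stream=True,
)

first_chunk_at = None
chunks = 0
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        if first_chunk_at is None:
            first_chunk_at = time.monotonic()  # marks time to first token (latency)
        chunks += 1  # received chunks used as a rough proxy for output tokens

finished_at = time.monotonic()
latency = first_chunk_at - request_sent
# Output speed: tokens per second while generating, i.e. after the first chunk arrived.
output_speed = (chunks - 1) / max(finished_at - first_chunk_at, 1e-9)
print(f"Latency (TTFT): {latency:.2f} s, output speed: {output_speed:.1f} tokens/s")
```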

Price: Price per token, represented as USD per million tokens. Price is a blend of input & output token prices (3:1 ratio).

Output price: Price per token generated by the model (received from the API), represented as USD per million tokens.

Input price: Price per token included in the request/message sent to the API, represented as USD per million tokens.
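For example, assuming the 3:1 blend above is a weighted average of three parts input price to one part output price, the blended figure can be reproduced as follows; the weighting convention and the sample prices are assumptions, though $2.50 input / $10.00 output per million tokens lands close to the $4.38 shown for GPT-4o (Aug '24) above.

```python
def blended_price(input_price: float, output_price: float) -> float:
    """Blend per-million-token prices using an assumed 3:1 input-to-output ratio."""
    return (3 * input_price + 1 * output_price) / 4

# Illustrative prices in USD per 1M tokens.
print(blended_price(input_price=2.50, output_price=10.00))  # -> 4.375
```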

Time period: Metrics are 'live' and are based on the past 14 days of measurements; measurements are taken 8 times per day for single requests and 2 times per day for parallel requests.
