Changelog
Meta Llama 3.1 now available on Workers AI
Workers AI now supports Meta Llama 3.1.
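A minimal sketch of calling the new model from a Worker; the model ID `@cf/meta/llama-3.1-8b-instruct` is an assumption, so check the models page for the exact name:

```ts
// Inside a Worker fetch handler with a Workers AI binding (env.AI).
// The model ID below is an assumption, not part of this announcement.
const response = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
  messages: [{ role: "user", content: "Tell me about Workers AI." }],
});
```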
New community-contributed tutorial
- Added a community-contributed tutorial on how to create APIs that recommend products on e-commerce sites using Workers AI and Stripe.
Introducing embedded function calling
- Introduced embedded function calling, a new way to do function calling on Workers AI (sketched below)
- Published the new `@cloudflare/ai-utils` npm package
- Open-sourced `ai-utils` on GitHub
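As a rough illustration, the `runWithTools` helper from `@cloudflare/ai-utils` runs the tool-execution loop for you; the exact helper signature, the model ID, and the tool shape below are assumptions to confirm against the package README (the `Ai` type comes from `@cloudflare/workers-types`):

```ts
import { runWithTools } from "@cloudflare/ai-utils";

export default {
  async fetch(request: Request, env: { AI: Ai }): Promise<Response> {
    const response = await runWithTools(
      env.AI,
      "@hf/nousresearch/hermes-2-pro-mistral-7b", // placeholder function-calling model
      {
        messages: [{ role: "user", content: "What is 27 * 42?" }],
        tools: [
          {
            name: "multiply",
            description: "Multiply two numbers",
            parameters: {
              type: "object",
              properties: {
                a: { type: "number", description: "First factor" },
                b: { type: "number", description: "Second factor" },
              },
              required: ["a", "b"],
            },
            // With embedded function calling, the helper invokes this function
            // and feeds its result back to the model automatically.
            function: async ({ a, b }: { a: number; b: number }) => String(a * b),
          },
        ],
      },
    );
    return Response.json(response);
  },
};
```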
Added support for traditional function calling
- Function calling is now supported on enabled models
- Properties added on models page to show which models support function calling
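By contrast with the embedded variant above, traditional function calling passes a `tools` array straight to `env.AI.run()` and leaves executing the returned tool calls to your code. A hedged sketch, with a placeholder model ID and tool schema:

```ts
// Inside a Worker fetch handler with a Workers AI binding (env.AI).
const response = await env.AI.run("@hf/nousresearch/hermes-2-pro-mistral-7b", {
  messages: [{ role: "user", content: "What is the weather in London?" }],
  tools: [
    {
      name: "getWeather",
      description: "Return the current weather for a given city",
      parameters: {
        type: "object",
        properties: {
          city: { type: "string", description: "Name of the city" },
        },
        required: ["city"],
      },
    },
  ],
});
// Inspect any returned tool calls and run the matching function yourself.
```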
Native support for AI Gateways
Workers AI now natively supports AI Gateway.
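A sketch of routing a binding call through a gateway; the gateway ID is a placeholder and the option shape should be confirmed in the AI Gateway docs:

```ts
// Routes this inference request through an AI Gateway for caching, rate limiting,
// and analytics. "my-gateway" is a placeholder gateway ID.
const response = await env.AI.run(
  "@cf/meta/llama-3-8b-instruct",
  { prompt: "Why route inference through a gateway?" },
  { gateway: { id: "my-gateway", skipCache: false } },
);
```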
Deprecation announcement for `@cf/meta/llama-2-7b-chat-int8`
We will be deprecating `@cf/meta/llama-2-7b-chat-int8` on 2024-06-30.

Replace the model ID in your code with a new model of your choice:

- `@cf/meta/llama-3-8b-instruct` is the newest model in the Llama family (and is currently free for a limited time on Workers AI).
- `@cf/meta/llama-3-8b-instruct-awq` is the new Llama 3 in a similar precision to your currently selected model. This model is also currently free for a limited time.

If you do not switch to a different model by June 30th, we will automatically start returning inference from `@cf/meta/llama-3-8b-instruct-awq`.
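The migration is a one-line change wherever the model ID appears, for example:

```ts
// Before (deprecated on 2024-06-30):
const before = await env.AI.run("@cf/meta/llama-2-7b-chat-int8", { prompt: "Hello" });

// After: a Llama 3 model in a similar precision
const after = await env.AI.run("@cf/meta/llama-3-8b-instruct-awq", { prompt: "Hello" });
```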
Add new public LoRAs and note on LoRA routing
- Added documentation on new public LoRAs.
- Noted that you can now run LoRA inference with the base model rather than explicitly calling the `-lora` version, as sketched below
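For illustration, a LoRA adapter can be applied via a `lora` parameter on the base model call; the base model ID and the public LoRA name below are assumptions to verify against the public LoRAs documentation:

```ts
// Runs inference on the base model with a public LoRA adapter applied,
// instead of calling the separate "-lora" model variant.
const response = await env.AI.run("@cf/mistral/mistral-7b-instruct-v0.2", {
  prompt: "Summarize this article: ...",
  lora: "cf-public-cnn-summarization", // placeholder public LoRA name
});
```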
Add OpenAI-compatible API endpoints
Added OpenAI-compatible API endpoints for `/v1/chat/completions` and `/v1/embeddings`. For more details, refer to Configurations.
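A sketch of pointing the official `openai` SDK at these endpoints; the base URL pattern and model IDs are assumptions, so confirm them in Configurations:

```ts
import OpenAI from "openai";

const ACCOUNT_ID = "<your-account-id>";          // placeholder
const API_TOKEN = "<your-cloudflare-api-token>"; // placeholder

const client = new OpenAI({
  apiKey: API_TOKEN,
  baseURL: `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1`,
});

// /v1/chat/completions
const chat = await client.chat.completions.create({
  model: "@cf/meta/llama-3-8b-instruct",
  messages: [{ role: "user", content: "Hello from the OpenAI SDK!" }],
});

// /v1/embeddings
const embeddings = await client.embeddings.create({
  model: "@cf/baai/bge-large-en-v1.5",
  input: "Workers AI embeddings example",
});
```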
Add AI native binding
- Added new AI native binding; you can now run models with `const resp = await env.AI.run(modelName, inputs)` (full example after this list)
- Deprecated the `@cloudflare/ai` npm package. While existing solutions using the `@cloudflare/ai` package will continue to work, no new Workers AI features will be supported. Moving to native AI bindings is highly recommended.
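For context, a minimal Worker using the native binding end to end; the binding name must match the `[ai]` section of your `wrangler.toml`, the model ID is a placeholder, and the `Ai` type comes from `@cloudflare/workers-types`:

```ts
// wrangler.toml (assumed):
//   [ai]
//   binding = "AI"

export default {
  async fetch(request: Request, env: { AI: Ai }): Promise<Response> {
    // No @cloudflare/ai wrapper needed; call the binding directly.
    const resp = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
      prompt: "What is the origin of the phrase 'Hello, World'?",
    });
    return Response.json(resp);
  },
};
```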