ollama/llm
Daniel Hiltgen 534342e7e2
Update MLX and MLX-C with threading fixes (#15845)
* Update MLX and MLX-C

* Run MLX CGO work on a locked OS thread

MLX now relies on OS-thread-local execution state for streams, encoders, and caches. Add an mlxthread executor backed by runtime.LockOSThread and route runner initialization, model load, inference, status memory reads, and cleanup through the worker so Go goroutine migration cannot split MLX state across native threads.

Also stop caching default MLX streams before the runner owns the thread and add worker/threaded MLX regression tests.

* mlx: use common status writer

* mlx: bundle missing libjaccl on arm64

Inspired by #15793

* review comments
2026-05-03 10:03:14 -07:00
llm_darwin.go        Optimize container images for startup (#6547)            2024-09-12 12:10:30 -07:00
llm_linux.go         Optimize container images for startup (#6547)            2024-09-12 12:10:30 -07:00
llm_windows.go       win: lint fix (#10571)                                   2025-05-05 11:08:12 -07:00
server.go            metal: harden for ggml initialization failures (#15755)  2026-04-30 16:28:03 -07:00
server_test.go       llm: Don't always evict models on CPU-only systems       2025-12-02 10:58:08 -08:00
server_wait_test.go  metal: harden for ggml initialization failures (#15755)  2026-04-30 16:28:03 -07:00
status.go            Update MLX and MLX-C with threading fixes (#15845)       2026-05-03 10:03:14 -07:00
status_test.go       Update MLX and MLX-C with threading fixes (#15845)       2026-05-03 10:03:14 -07:00