
Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog.

Unreleased

Changed

  • Search for pip-installed CUDA 11 as well as CUDA 12 (#2802)
  • Stop shipping CUBINs to reduce wheel size (#2802)

2.8.0 - 2024-08-05

Added

  • Support GPT-NeoX, Gemma 2, OpenELM, ChatGLM, and Jais architectures (all with Vulkan support) (#2694)
  • Enable Vulkan support for StarCoder2, XVERSE, Command R, and OLMo (#2694)
  • Support DeepSeek-V2 architecture (no Vulkan support) (#2702)
  • Add Llama 3.1 8B Instruct to models3.json (by @3Simplex in #2731 and #2732)
  • Support Llama 3.1 RoPE scaling (#2758)
  • Add Qwen2-1.5B-Instruct to models3.json (by @ThiloteE in #2759)
  • Detect use of a Python interpreter under Rosetta for a clearer error message (#2793)

Changed

  • Build against CUDA 11.8 instead of CUDA 12 for better compatibility with older drivers (#2639)
  • Update llama.cpp to commit 87e397d00 from July 19th (#2694)

Removed

  • Remove unused internal llmodel_has_gpu_device (#2409)
  • Remove support for GPT-J models (#2676, #2693)

Fixed

  • Fix debug mode crash on Windows and undefined behavior in LLamaModel::embedInternal (#2467)
  • Fix CUDA PTX errors with some GPT4All builds (#2421)
  • Fix mishandling of inputs greater than n_ctx tokens after #1970 (#2498)
  • Fix crash when Kompute falls back to CPU (#2640)
  • Fix several Kompute resource management issues (#2694)
  • Fix crash/hang when some models stop generating, by showing special tokens (#2701)
  • Fix several backend issues (#2778)
    • Restore leading space removal logic that was incorrectly removed in #2694
    • CUDA: Cherry-pick the llama.cpp fix for the DMMV cols requirement; the bug had caused crashes with long conversations since #2694