gpt4all

mirror of https://github.com/nomic-ai/gpt4all.git synced 2025-08-17 23:46:55 +00:00

Author	SHA1	Message	Date
AT	8c834a5177	Update llama.cpp to include upstream Llama 3.1 RoPE fix. (#2758 ) Signed-off-by: Adam Treat <treat.adam@gmail.com>	2024-07-27 14:14:19 -04:00
Jared Van Bortel	2a7fe95ff4	llamamodel: always print special tokens (#2701 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-07-22 13:32:17 -04:00
Jared Van Bortel	4ca1d0411f	llamamodel: add DeepSeek-V2 to whitelist (#2702 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-07-22 13:32:04 -04:00
Jared Van Bortel	290c629442	backend: rebase llama.cpp submodule on latest upstream (#2694 ) * Adds support for GPT-NeoX, Gemma 2, OpenELM, ChatGLM, and Jais architectures (all with Kompute support) * Also enables Kompute support for StarCoder2, XVERSE, Command R, and OLMo * Includes a number of Kompute resource management fixes Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-07-19 14:52:58 -04:00
AT	ca72428783	Remove support for GPT-J models. (#2676 ) Signed-off-by: Adam Treat <treat.adam@gmail.com> Signed-off-by: Jared Van Bortel <jared@nomic.ai> Co-authored-by: Jared Van Bortel <jared@nomic.ai>	2024-07-17 16:07:37 -04:00
Jared Van Bortel	6cb3ddafd6	llama.cpp: update submodule for CPU fallback fix (#2640 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-07-10 17:56:19 -04:00
Jared Van Bortel	bd307abfe6	backend: fix a crash on inputs greater than n_ctx (#2498 ) This fixes a regression in commit `4fc4d94b` ("fix chat-style prompt templates (#1970)"), which moved some return statements into a new function (LLModel::decodePrompt) without making them return from the parent as well. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-07-01 11:33:46 -04:00
Jared Van Bortel	01870b4a46	chat: fix blank device in UI and improve Mixpanel reporting (#2409 ) Also remove LLModel::hasGPUDevice. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-06-26 15:26:27 -04:00
Jared Van Bortel	da1823ed7a	cmake: fix CMAKE_CUDA_ARCHITECTURES default (#2421 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-06-26 14:48:18 -04:00
Jared Van Bortel	88d85be0f9	chat: fix build on Windows and Nomic Embed path on macOS (#2467 ) * chat: remove unused oscompat source files These files are no longer needed now that the hnswlib index is gone. This fixes an issue with the Windows build as there was a compilation error in oscompat.cpp. Signed-off-by: Jared Van Bortel <jared@nomic.ai> * llm: fix pragma to be recognized by MSVC Replaces this MSVC warning: C:\msys64\home\Jared\gpt4all\gpt4all-chat\llm.cpp(53,21): warning C4081: expected '('; found 'string' With this: C:\msys64\home\Jared\gpt4all\gpt4all-chat\llm.cpp : warning : offline installer build will not check for updates! Signed-off-by: Jared Van Bortel <jared@nomic.ai> * usearch: fork usearch to fix `CreateFile` build error Signed-off-by: Jared Van Bortel <jared@nomic.ai> * dlhandle: fix incorrect assertion on Windows SetErrorMode returns the previous value of the error mode flags, not an indicator of success. Signed-off-by: Jared Van Bortel <jared@nomic.ai> * llamamodel: fix UB in LLamaModel::embedInternal It is undefined behavior to increment an STL iterator past the end of the container. Use offsets to do the math instead. Signed-off-by: Jared Van Bortel <jared@nomic.ai> * cmake: install embedding model to bundle's Resources dir on macOS Signed-off-by: Jared Van Bortel <jared@nomic.ai> * ci: fix macOS build by explicitly installing Rosetta Signed-off-by: Jared Van Bortel <jared@nomic.ai> --------- Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-06-25 17:22:51 -04:00
AT	9273b49b62	chat: major UI redesign for v3.0.0 (#2396 ) Signed-off-by: Adam Treat <treat.adam@gmail.com> Signed-off-by: Jared Van Bortel <jared@nomic.ai> Co-authored-by: Jared Van Bortel <jared@nomic.ai>	2024-06-24 18:49:23 -04:00
Jared Van Bortel	55d709862f	Revert "typescript bindings maintenance (#2363 )" As discussed on Discord, this PR was not ready to be merged. CI fails on it. This reverts commit `a602f7fde7`. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-06-03 17:26:19 -04:00
Andreas Obersteiner	a602f7fde7	typescript bindings maintenance (#2363 ) * remove outdated comments Signed-off-by: limez <limez@protonmail.com> * simpler build from source Signed-off-by: limez <limez@protonmail.com> * update unix build script to create .so runtimes correctly Signed-off-by: limez <limez@protonmail.com> * configure ci build type, use RelWithDebInfo for dev build script Signed-off-by: limez <limez@protonmail.com> * add clean script Signed-off-by: limez <limez@protonmail.com> * fix streamed token decoding / emoji Signed-off-by: limez <limez@protonmail.com> * remove deprecated nCtx Signed-off-by: limez <limez@protonmail.com> * update typings Signed-off-by: jacob <jacoobes@sern.dev> update typings Signed-off-by: jacob <jacoobes@sern.dev> * readme,mspell Signed-off-by: jacob <jacoobes@sern.dev> * cuda/backend logic changes + name napi methods like their js counterparts Signed-off-by: limez <limez@protonmail.com> * convert llmodel example into a test, separate test suite that can run in ci Signed-off-by: limez <limez@protonmail.com> * update examples / naming Signed-off-by: limez <limez@protonmail.com> * update deps, remove the need for binding.ci.gyp, make node-gyp-build fallback easier testable Signed-off-by: limez <limez@protonmail.com> * make sure the assert-backend-sources.js script is published, but not the others Signed-off-by: limez <limez@protonmail.com> * build correctly on windows (regression on node-gyp-build) Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> * codespell Signed-off-by: limez <limez@protonmail.com> * make sure dlhandle.cpp gets linked correctly Signed-off-by: limez <limez@protonmail.com> * add include for check_cxx_compiler_flag call during aarch64 builds Signed-off-by: limez <limez@protonmail.com> * x86 > arm64 cross compilation of runtimes and bindings Signed-off-by: limez <limez@protonmail.com> * default to cpu instead of kompute on arm64 Signed-off-by: limez <limez@protonmail.com> * formatting, more minimal example Signed-off-by: limez <limez@protonmail.com> --------- Signed-off-by: limez <limez@protonmail.com> Signed-off-by: jacob <jacoobes@sern.dev> Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> Co-authored-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> Co-authored-by: jacob <jacoobes@sern.dev>	2024-06-03 11:12:55 -05:00
Jared Van Bortel	636307160e	backend: fix #includes with include-what-you-use (#2371 ) Also fix a PARENT_SCOPE warning when building the backend. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-31 16:34:54 -04:00
Jared Van Bortel	8ba7ef4832	dlhandle: suppress DLL errors on Windows (#2389 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-31 16:33:40 -04:00
Jared Van Bortel	4e89a9c44f	backend: support non-ASCII characters in path to llmodel libs on Windows (#2388 ) * backend: refactor dlhandle.h into oscompat.{cpp,h} Signed-off-by: Jared Van Bortel <jared@nomic.ai> * llmodel: alias std::filesystem Signed-off-by: Jared Van Bortel <jared@nomic.ai> * llmodel: use wide strings for paths on Windows Using the native path representation allows us to manipulate paths and call LoadLibraryEx without mangling non-ASCII characters. Signed-off-by: Jared Van Bortel <jared@nomic.ai> * llmodel: prefer built-in std::filesystem functionality Signed-off-by: Jared Van Bortel <jared@nomic.ai> * oscompat: fix string type error Signed-off-by: Jared Van Bortel <jared@nomic.ai> * backend: rename oscompat back to dlhandle Signed-off-by: Jared Van Bortel <jared@nomic.ai> * dlhandle: fix #includes Signed-off-by: Jared Van Bortel <jared@nomic.ai> * dlhandle: remove another #include Signed-off-by: Jared Van Bortel <jared@nomic.ai> * dlhandle: move dlhandle #include Signed-off-by: Jared Van Bortel <jared@nomic.ai> * dlhandle: remove #includes that are covered by dlhandle.h Signed-off-by: Jared Van Bortel <jared@nomic.ai> * llmodel: fix #include order Signed-off-by: Jared Van Bortel <jared@nomic.ai> --------- Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-31 13:12:28 -04:00
Jared Van Bortel	e94177ee9a	llamamodel: fix embedding crash for >512 tokens after #2310 (#2383 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-29 10:51:00 -04:00
Jared Van Bortel	f047f383d0	llama.cpp: update submodule for "code" model crash workaround (#2382 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-29 10:50:00 -04:00
Jared Van Bortel	f1b4092ca6	llamamodel: fix BERT tokenization after llama.cpp update (#2381 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-28 13:11:57 -04:00
Jared Van Bortel	c779d8a32d	python: init_gpu fixes (#2368 ) * python: tweak GPU init failure message * llama.cpp: update submodule for use-after-free fix Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-20 18:04:11 -04:00
Jared Van Bortel	2025d2d15b	llmodel: add CUDA to the DLL search path if CUDA_PATH is set (#2357 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-16 17:39:49 -04:00
Jared Van Bortel	a92d266cea	cmake: fix Metal build after #2310 (#2350 ) I don't understand why this is needed, but it works. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-15 18:12:32 -04:00
Jared Van Bortel	d2a99d9bc6	support the llama.cpp CUDA backend (#2310 ) * rebase onto llama.cpp commit ggerganov/llama.cpp@d46dbc76f * support for CUDA backend (enabled by default) * partial support for Occam's Vulkan backend (disabled by default) * partial support for HIP/ROCm backend (disabled by default) * sync llama.cpp.cmake with upstream llama.cpp CMakeLists.txt * changes to GPT4All backend, bindings, and chat UI to handle choice of llama.cpp backend (Kompute or CUDA) * ship CUDA runtime with installed version * make device selection in the UI on macOS actually do something * model whitelist: remove dbrx, mamba, persimmon, plamo; add internlm and starcoder2 Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-15 15:27:50 -04:00
Jared Van Bortel	9f9d8e636f	backend: do not crash if GGUF lacks general.architecture (#2346 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-15 13:57:13 -04:00
Jared Van Bortel	6d8888b267	llamamodel: free the batch in embedInternal (#2348 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-15 12:46:12 -04:00
Jared Van Bortel	577ebd4826	mixpanel: report cpu_supports_avx2 on startup (#2299 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-02 16:09:41 -04:00
Jared Van Bortel	adaecb7a72	mixpanel: improved GPU device statistics (plus GPU sort order fix) (#2297 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-01 16:15:48 -04:00
Jared Van Bortel	c622921894	improve mixpanel usage statistics (#2238 ) Other changes: - Always display first start dialog if privacy options are unset (e.g. if the user closed GPT4All without selecting them) - LocalDocs scanQueue is now always deferred - Fix a potential crash in magic_match - LocalDocs indexing is now started after the first start dialog is dismissed so usage stats are included Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-04-25 13:16:52 -04:00
Jared Van Bortel	ba53ab5da0	python: do not print GPU name with verbose=False, expose this info via properties (#2222 ) * llamamodel: only print device used in verbose mode Signed-off-by: Jared Van Bortel <jared@nomic.ai> * python: expose backend and device via GPT4All properties Signed-off-by: Jared Van Bortel <jared@nomic.ai> * backend: const correctness fixes Signed-off-by: Jared Van Bortel <jared@nomic.ai> * python: bump version Signed-off-by: Jared Van Bortel <jared@nomic.ai> * python: typing fixups Signed-off-by: Jared Van Bortel <jared@nomic.ai> * python: fix segfault with closed GPT4All Signed-off-by: Jared Van Bortel <jared@nomic.ai> --------- Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-04-18 14:52:02 -04:00
Jared Van Bortel	ac498f79ac	fix regressions in system prompt handling (#2219 ) * python: fix system prompt being ignored * fix unintended whitespace after system prompt Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-04-15 11:39:48 -04:00
Jared Van Bortel	3f8257c563	llamamodel: fix semantic typo in nomic client dynamic mode (#2216 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-04-12 17:25:15 -04:00
Jared Van Bortel	46818e466e	python: embedding cancel callback for nomic client dynamic mode (#2214 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-04-12 16:00:39 -04:00
Jared Van Bortel	459289b94c	embed4all: small fixes related to nomic client local embeddings (#2213 ) * actually submit larger batches with increased n_ctx * fix crash when llama_tokenize returns no tokens Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-04-12 10:54:15 -04:00
Jared Van Bortel	1b84a48c47	python: add list_gpus to the GPT4All API (#2194 ) Other changes: * fix memory leak in llmodel_available_gpu_devices * drop model argument from llmodel_available_gpu_devices * breaking: make GPT4All/Embed4All arguments past model_name keyword-only Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-04-04 14:52:13 -04:00
Jared Van Bortel	67843edc7c	backend: update llama.cpp submodule for wpm locale fix (#2163 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-03-26 11:04:22 -04:00
Jared Van Bortel	83ada4ca89	backend: update llama.cpp submodule for Unicode paths fix (#2162 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-03-26 11:01:02 -04:00
Jared Van Bortel	0455b80b7f	Embed4All: optionally count tokens, misc fixes (#2145 ) Key changes: * python: optionally return token count in Embed4All.embed * python and docs: models2.json -> models3.json * Embed4All: require explicit prefix for unknown models * llamamodel: fix shouldAddBOS for Bert and Nomic Bert Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-03-20 11:24:02 -04:00
Jared Van Bortel	a1bb6084ed	python: documentation update and typing improvements (#2129 ) Key changes: * revert "python: tweak constructor docstrings" * docs: update python GPT4All and Embed4All documentation * breaking: require keyword args to GPT4All.generate Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-03-19 17:25:22 -04:00
Jared Van Bortel	699410014a	fix non-AVX CPU detection (#2141 ) * chat: fix non-AVX CPU detection on Windows * bindings: throw exception instead of logging to console Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-03-19 10:56:14 -04:00
Jared Van Bortel	255568fb9a	python: various fixes for GPT4All and Embed4All (#2130 ) Key changes: * honor empty system prompt argument * current_chat_session is now read-only and defaults to None * deprecate fallback prompt template for unknown models * fix mistakes from #2086 Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-03-15 11:49:58 -04:00
Jared Van Bortel	53f109f519	llamamodel: fix macOS build (#2125 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-03-14 12:06:07 -04:00
Jared Van Bortel	406e88b59a	implement local Nomic Embed via llama.cpp (#2086 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-03-13 18:09:24 -04:00
Jared Van Bortel	5c248dbec9	models: new MPT model file without duplicated token_embd.weight (#2006 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-03-08 17:18:38 -05:00
Jared Van Bortel	c19b763e03	llmodel_c: expose fakeReply to the bindings (#2061 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-03-06 13:32:24 -05:00
Jared Van Bortel	f500bcf6e5	llmodel: default to a blank line between reply and next prompt (#1996 ) Also make some related adjustments to the provided Alpaca-style prompt templates and system prompts. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-26 13:11:15 -05:00
Jared Van Bortel	007d469034	bert: fix layer norm epsilon value (#1946 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-26 13:09:01 -05:00
Adam Treat	f720261d46	Fix another vulnerable spot for crashes. Signed-off-by: Adam Treat <treat.adam@gmail.com>	2024-02-26 12:04:16 -06:00
chrisbarrera	f8b1069a1c	add min_p sampling parameter (#2014 ) Signed-off-by: Christopher Barrera <cb@arda.tx.rr.com> Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>	2024-02-24 17:51:34 -05:00
Jared Van Bortel	e7f2ff189f	fix some compilation warnings on macOS Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-22 15:09:06 -05:00
Jared Van Bortel	88e330ef0e	llama.cpp: enable Kompute support for 10 more model arches (#2005 ) These are Baichuan, Bert and Nomic Bert, CodeShell, GPT-2, InternLM, MiniCPM, Orion, Qwen, and StarCoder. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-22 14:34:42 -05:00
Jared Van Bortel	fc6c5ea0c7	llama.cpp: gemma: allow offloading the output tensor (#1997 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-22 14:06:18 -05:00
Jared Van Bortel	4fc4d94be4	fix chat-style prompt templates (#1970 ) Also use a new version of Mistral OpenOrca. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-21 15:45:32 -05:00
Jared Van Bortel	7810b757c9	llamamodel: add gemma model support Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-21 13:36:31 -06:00
Adam Treat	d948a4f2ee	Complete revamp of model loading to allow for more discreet control by the user of the models loading behavior. Signed-off-by: Adam Treat <treat.adam@gmail.com>	2024-02-21 10:15:20 -06:00
Jared Van Bortel	6fdec808b2	backend: update llama.cpp for faster state serialization Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-13 17:39:18 -05:00
Jared Van Bortel	a1471becf3	backend: update llama.cpp for Intel GPU blacklist Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-12 13:16:24 -05:00
Jared Van Bortel	eb1081d37e	cmake: fix LLAMA_DIR use before set Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-09 22:00:14 -05:00
Jared Van Bortel	e60b388a2e	cmake: fix backwards LLAMA_KOMPUTE default Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-09 21:53:32 -05:00
Jared Van Bortel	fc7e5f4a09	ci: fix missing Kompute support in python bindings (#1953 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-09 21:40:32 -05:00
Jared Van Bortel	bf493bb048	Mixtral crash fix and python bindings v2.2.0 (#1931 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-06 11:01:15 -05:00
Jared Van Bortel	92c025a7f6	llamamodel: add 12 new architectures for CPU inference (#1914 ) Baichuan, BLOOM, CodeShell, GPT-2, Orion, Persimmon, Phi and Phi-2, Plamo, Qwen, Qwen2, Refact, StableLM Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-05 16:49:31 -05:00
Jared Van Bortel	10e3f7bbf5	Fix VRAM leak when model loading fails (#1901 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-01 15:45:45 -05:00
Jared Van Bortel	eadc3b8d80	backend: bump llama.cpp for VRAM leak fix when switching models Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-01-31 17:24:01 -05:00
Jared Van Bortel	6db5307730	update llama.cpp for unhandled Vulkan OOM exception fix Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-01-31 16:44:58 -05:00
Jared Van Bortel	0a40e71652	Maxwell/Pascal GPU support and crash fix (#1895 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-01-31 16:32:32 -05:00
Jared Van Bortel	b11c3f679e	bump llama.cpp-mainline for C++11 compat Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-01-31 15:02:34 -05:00
Jared Van Bortel	061d1969f8	expose n_gpu_layers parameter of llama.cpp (#1890 ) Also dynamically limit the GPU layers and context length fields to the maximum supported by the model. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-01-31 14:17:44 -05:00
Jared Van Bortel	f549d5a70a	backend : quick llama.cpp update to fix fallback to CPU Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-01-29 17:16:40 -05:00
Jared Van Bortel	38c61493d2	backend: update to latest commit of llama.cpp Vulkan PR Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-01-29 15:47:26 -06:00
Jared Van Bortel	26acdebafa	convert: replace GPTJConfig with AutoConfig (#1866 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-01-22 12:14:55 -05:00
Jared Van Bortel	a9c5f53562	update llama.cpp for nomic-ai/llama.cpp#12 Fixes #1477 Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-01-17 14:05:33 -05:00
Jared Van Bortel	b7c92c5afd	sync llama.cpp with latest Vulkan PR and newer upstream (#1819 )	2024-01-16 16:36:21 -05:00
Jared Van Bortel	7e9786fccf	chat: set search path early This fixes the issues with installed versions of v2.6.0.	2024-01-11 12:04:18 -05:00
AT	96cee4f9ac	Explicitly clear the kv cache each time we eval tokens to match n_past. (#1808 )	2024-01-03 14:06:08 -05:00
ThiloteE	2d566710e5	Address review	2024-01-03 11:13:07 -06:00
ThiloteE	a0f7d7ae0e	Fix for "LLModel ERROR: Could not find CPU LLaMA implementation" v2	2024-01-03 11:13:07 -06:00
ThiloteE	38d81c14d0	Fixes https://github.com/nomic-ai/gpt4all/issues/1760 LLModel ERROR: Could not find CPU LLaMA implementation. Inspired by Microsoft docs for LoadLibraryExA (https://learn.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryexa). When using LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR, the lpFileName parameter must specify a fully qualified path, also it needs to be backslashes (\), not forward slashes (/).	2024-01-03 11:13:07 -06:00
Jared Van Bortel	d1c56b8b28	Implement configurable context length (#1749 )	2023-12-16 17:58:15 -05:00
Jared Van Bortel	3acbef14b7	fix AVX support by removing direct linking to AVX2 libs (#1750 )	2023-12-13 12:11:09 -05:00
Jared Van Bortel	0600f551b3	chatllm: do not attempt to serialize incompatible state (#1742 )	2023-12-12 11:45:03 -05:00
Jared Van Bortel	1df3da0a88	update llama.cpp for clang warning fix	2023-12-11 13:07:41 -05:00
Jared Van Bortel	dfd8ef0186	backend: use ggml_new_graph for GGML backend v2 (#1719 )	2023-12-06 14:38:53 -05:00
Jared Van Bortel	9e28dfac9c	Update to latest llama.cpp (#1706 )	2023-12-01 16:51:15 -05:00
Adam Treat	cce5fe2045	Fix macos build.	2023-11-17 11:59:31 -05:00
Adam Treat	371e2a5cbc	LocalDocs version 2 with text embeddings.	2023-11-17 11:59:31 -05:00
Jared Van Bortel	d4ce9f4a7c	llmodel_c: improve quality of error messages (#1625 )	2023-11-07 11:20:14 -05:00
cebtenzzre	64101d3af5	update llama.cpp-mainline	2023-11-01 09:47:39 -04:00
Adam Treat	ffef60912f	Update to llama.cpp	2023-10-30 11:40:16 -04:00
Adam Treat	f5f22fdbd0	Update llama.cpp for latest bugfixes.	2023-10-28 17:47:55 -04:00
cebtenzzre	7bcd9e8089	update llama.cpp-mainline	2023-10-27 19:29:36 -04:00
cebtenzzre	fd0c501d68	backend: support GGUFv3 (#1582 )	2023-10-27 17:07:23 -04:00
Adam Treat	14b410a12a	Update to latest version of llama.cpp which fixes issue 1507.	2023-10-27 12:08:35 -04:00
Adam Treat	ab96035bec	Update to llama.cpp submodule for some vulkan fixes.	2023-10-26 13:46:38 -04:00
cebtenzzre	e90263c23f	make scripts executable (#1555 )	2023-10-24 09:28:21 -04:00
Aaron Miller	f414c28589	llmodel: whitelist library name patterns this fixes some issues that were being seen on installed windows builds of 2.5.0 only load dlls that actually might be model impl dlls, otherwise we pull all sorts of random junk into the process before it might expect to be Signed-off-by: Aaron Miller <apage43@ninjawhale.com>	2023-10-23 21:40:14 -07:00
cebtenzzre	4338e72a51	MPT: use upstream llama.cpp implementation (#1515 )	2023-10-19 15:25:17 -04:00
cebtenzzre	0fe2e19691	llamamodel: re-enable error messages by default (#1537 )	2023-10-19 13:46:33 -04:00
cebtenzzre	017c3a9649	python: prepare version 2.0.0rc1 (#1529 )	2023-10-18 20:24:54 -04:00
cebtenzzre	9a19c740ee	kompute: fix library loading issues with kp_logger (#1517 )	2023-10-16 16:58:17 -04:00
Aaron Miller	f79557d2aa	speedup: just use matvec shaders for matmat so far my from-scratch matmats are still slower than just running more invocations of the existing Metal ported matvec shaders - it should be theoretically possible to make a matmat that's faster (for actual matmat cases) than an optimal matvec, but it will need to be at least* as fast as the mat*vec op and then take special care to be cache-friendly and save memory bandwidth, as the # of compute ops is the same	2023-10-16 13:45:51 -04:00

318 Commits