Jared Van Bortel
9772027e5e
WIP: provider page in the "add models" view
2025-03-19 10:49:39 -04:00
Jared Van Bortel
f7cd880f96
make it build - still plenty of TODOs
2025-03-11 17:08:11 -04:00
Jared Van Bortel
7745f208bc
WIP (clang is crashing)
2025-03-11 13:33:06 -04:00
Jared Van Bortel
1dc9f22d5b
WIP
2025-03-03 11:16:36 -05:00
Jared Van Bortel
1ba555a174
fix handling of responses that come in chunks, and non-200 status codes
2025-02-28 12:38:48 -05:00
Jared Van Bortel
d20cfbbec9
stuff is working now
2025-02-27 18:35:33 -05:00
Jared Van Bortel
068845e1a2
don't duplicate QCoro's exception passing
2025-02-27 14:56:14 -05:00
Jared Van Bortel
ea2ced8c8b
fix json EOF handling
2025-02-27 14:34:13 -05:00
Jared Van Bortel
cc6f995795
fix #includes
2025-02-27 11:50:16 -05:00
Jared Van Bortel
d4e9a6177b
finished initial impl of /show and tested -> hangs!
2025-02-26 20:01:58 -05:00
Jared Van Bortel
7ce2ea57e0
implement /api/show (not tested)
2025-02-26 19:47:25 -05:00
Jared Van Bortel
85eaa41e6d
base url should include /api/
2025-02-26 17:18:56 -05:00
Jared Van Bortel
e872f1db2d
underscores to dashes
2025-02-26 17:14:39 -05:00
Jared Van Bortel
86de26ead2
implement and test /api/tags
2025-02-26 16:58:00 -05:00
Jared Van Bortel
4c5dcf59ea
rename the class to "OllamaClient"
2025-02-26 16:48:05 -05:00
Jared Van Bortel
06475dd113
WIP: use Boost::json for incremental parsing and reflection
2025-02-26 15:57:11 -05:00
Jared Van Bortel
927e963076
parse the JSON response
2025-02-25 16:11:40 -05:00
Jared Van Bortel
b5144decde
fix #includes
2025-02-25 14:58:04 -05:00
Jared Van Bortel
407cb81725
stop using C++20 modules
2025 is too soon to use C++ features from 2020 without running into bugs
in every build tool that touches the project.
2025-02-25 12:10:40 -05:00
Jared Van Bortel
1699e77e97
WIP: get build working on macOS
2025-02-25 12:10:09 -05:00
Jared Van Bortel
ebe6352fc8
WIP (hit a clang bug causing an incorrect compiler error)
2025-02-25 12:10:08 -05:00
Jared Van Bortel
196c387bf7
WIP: bring back old backend so we can test the gpt4all-chat build
2025-02-25 12:01:37 -05:00
Jared Van Bortel
729a5b0d9f
ollama-hpp immediately segfaulted. will try something else
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2025-02-25 12:00:48 -05:00
Jared Van Bortel
f4a350d606
WIP: working fmt dep
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2025-02-25 12:00:48 -05:00
Jared Van Bortel
c6d0a1f2b9
enable color diagnostics with ninja
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2025-02-25 12:00:48 -05:00
Jared Van Bortel
b194d71e86
WIP: backend dependencies
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2025-02-25 12:00:48 -05:00
Jared Van Bortel
8e94409be9
WIP: gpt4all backend stub
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2025-02-25 12:00:05 -05:00
Jared Van Bortel
96aeb44210
backend: build with CUDA compute 5.0 support by default (#3499)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2025-02-19 11:27:06 -05:00
ThiloteE
02e12089d3
Add Granite arch to model whitelist (#3487)
Signed-off-by: ThiloteE <73715071+ThiloteE@users.noreply.github.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2025-02-12 14:17:49 -05:00
Jared Van Bortel
22ebd42c32
Misc fixes for undefined behavior, crashes, and build failure (#3465)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2025-02-06 11:22:52 -05:00
ThiloteE
6ef0bd518e
Whitelist OLMoE and Granite MoE (#3449)
Signed-off-by: ThiloteE <73715071+ThiloteE@users.noreply.github.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2025-02-04 18:00:07 -05:00
Jared Van Bortel
343a4b6b6a
Support DeepSeek-R1 Qwen (#3431)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2025-01-29 09:51:50 -05:00
Jared Van Bortel
0c70b5a5f4
llamamodel: add missing softmax to fix temperature (#3202)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-12-04 10:56:19 -05:00
Jared Van Bortel
225bf6be93
Remove binary state from high-level API and use Jinja templates (#3147)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Signed-off-by: Adam Treat <treat.adam@gmail.com>
Co-authored-by: Adam Treat <treat.adam@gmail.com>
2024-11-25 10:04:17 -05:00
Jared Van Bortel
f07e2e63df
Use the token cache to infer greater n_past and reuse results (#3073)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-10-31 11:19:12 -04:00
Jared Van Bortel
c3357b7625
Enable more warning flags, and fix more warnings (#3065)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-10-18 12:11:03 -04:00
Jared Van Bortel
8e3108fe1f
Establish basic compiler warnings, and fix a few style issues (#3039)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-10-09 09:11:50 -04:00
AT
ea1ade8668
Use different language for prompt size too large. (#3004)
Signed-off-by: Adam Treat <treat.adam@gmail.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2024-09-27 12:29:22 -04:00
Jared Van Bortel
f9d6be8afb
backend: rebase llama.cpp on upstream as of Sep 26th (#2998)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-09-27 12:05:59 -04:00
Ikko Eltociear Ashimine
1047c5e038
docs: update README.md (#2979)
Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
Signed-off-by: AT <manyoso@users.noreply.github.com>
Co-authored-by: AT <manyoso@users.noreply.github.com>
2024-09-23 16:12:52 -04:00
Jared Van Bortel
69782cf713
chat(build): fix broken installer on macOS (#2973)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-09-20 15:34:20 -04:00
Jared Van Bortel
39005288c5
server: improve correctness of request parsing and responses (#2929)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-09-09 10:48:57 -04:00
Jared Van Bortel
ca151f3519
repo: organize sources, headers, and deps into subdirectories (#2917)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-27 17:22:40 -04:00
Jared Van Bortel
6518b33697
llamamodel: use greedy sampling when temp=0 (#2854)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-13 17:04:50 -04:00
Jared Van Bortel
7463b2170b
backend(build): set CUDA arch defaults before enable_language(CUDA) (#2855)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-13 14:47:48 -04:00
Jared Van Bortel
971c83d1d3
llama.cpp: pull in fix for Kompute-related nvidia-egl crash (#2843)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-13 11:10:10 -04:00
Jared Van Bortel
26113a17fb
don't use ranges::contains due to clang incompatibility (#2812)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-08 11:49:01 -04:00
Jared Van Bortel
de7cb36fcc
python: reduce size of wheels built by CI, other build tweaks (#2802)
* Read CMAKE_CUDA_ARCHITECTURES directly
* Disable CUBINs for python build in CI
* Search for CUDA 11 as well as CUDA 12
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-07 11:27:50 -04:00
Jared Van Bortel
be66ec8ab5
chat: faster KV shift, continue generating, fix stop sequences (#2781)
* Don't stop generating at end of context
* Use llama_kv_cache ops to shift context
* Fix and improve reverse prompt detection
* Replace prompt recalc callback with a flag to disallow context shift
2024-08-07 11:25:24 -04:00
Jared Van Bortel
51bd01ae05
backend: fix extra spaces in tokenization and a CUDA crash (#2778)
Also potentially improves accuracy of BOS insertion, token cache, and logit indexing.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-01 10:46:36 -04:00