Jared Van Bortel 
							
						 
					 
					
						
						
							
						
						26113a17fb 
					 
					
						
						
							
							don't use ranges::contains due to clang incompatibility (#2812)

						Signed-off-by: Jared Van Bortel <jared@nomic.ai>
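
std::ranges::contains is a C++23 library addition, so a toolchain whose standard library does not provide it rejects the call. The following is a minimal sketch of one portable substitute built on std::find; the helper name containsValue is hypothetical, and nothing here is taken from the commit's actual diff.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper (not from the commit): reports whether `value`
// occurs in `range`, using only std::find so that it does not depend on
// C++23 library support for std::ranges::contains.
template <typename Range, typename T>
bool containsValue(const Range &range, const T &value)
{
    return std::find(std::begin(range), std::end(range), value) != std::end(range);
}
```

A call such as containsValue(stopSequences, candidate) then reads much like the ranges version while compiling on older standard libraries.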
						
						
					 
					
						2024-08-08 11:49:01 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
						
						be66ec8ab5 
					 
					
						
						
							
							chat: faster KV shift, continue generating, fix stop sequences (#2781)

						* Don't stop generating at end of context
						* Use llama_kv_cache ops to shift context
						* Fix and improve reverse prompt detection
						* Replace prompt recalc callback with a flag to disallow context shift
						
						
					 
					
						2024-08-07 11:25:24 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
						
						51bd01ae05 
					 
					
						
						
							
							backend: fix extra spaces in tokenization and a CUDA crash (#2778)

						Also potentially improves accuracy of BOS insertion, token cache, and logit indexing.
						Signed-off-by: Jared Van Bortel <jared@nomic.ai>
						
						
					 
					
						2024-08-01 10:46:36 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
						
						bd307abfe6 
					 
					
						
						
							
							backend: fix a crash on inputs greater than n_ctx (#2498)

						This fixes a regression in commit 4fc4d94b ("fix chat-style prompt templates (#1970)"), which moved some return statements into a new
						function (LLModel::decodePrompt) without making them return from the
						parent as well.
						Signed-off-by: Jared Van Bortel <jared@nomic.ai>
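
The class of bug described in this message can be sketched as follows. This is an illustration under stated assumptions, not the project's code: decodePromptSketch and promptSketch are hypothetical names, and the size check is a stand-in for whatever conditions the real function tests.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the extracted helper: returns false when the
// input cannot be processed (here, when it does not fit in the context).
bool decodePromptSketch(const std::vector<int> &tokens, std::size_t nCtx)
{
    if (tokens.size() > nCtx)
        return false; // this only leaves the helper, not its caller
    return true;
}

// Hypothetical parent: it has to check the helper's result and return
// itself; otherwise it would keep going with an oversized input.
bool promptSketch(const std::vector<int> &tokens, std::size_t nCtx,
                  std::vector<int> &processed)
{
    if (!decodePromptSketch(tokens, nCtx))
        return false;
    processed = tokens; // reached only when the input fits
    return true;
}
```

When early returns are moved out of a function into a helper, the helper has to report the outcome and the caller has to act on it, as the check in promptSketch does.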
						
						
					 
					
						2024-07-01 11:33:46 -04:00 
						 
				 
			
				
					
						
							
							
								AT 
							
						 
					 
					
						
						
							
						
						9273b49b62 
					 
					
						
						
							
							chat: major UI redesign for v3.0.0 (#2396)

						Signed-off-by: Adam Treat <treat.adam@gmail.com>
						Signed-off-by: Jared Van Bortel <jared@nomic.ai>
						Co-authored-by: Jared Van Bortel <jared@nomic.ai>
						
						
					 
					
						2024-06-24 18:49:23 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
						
						636307160e 
					 
					
						
						
							
							backend: fix #includes with include-what-you-use (#2371)

						Also fix a PARENT_SCOPE warning when building the backend.
						Signed-off-by: Jared Van Bortel <jared@nomic.ai>
						
						
					 
					
						2024-05-31 16:34:54 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
						
						46818e466e 
					 
					
						
						
							
							python: embedding cancel callback for nomic client dynamic mode (#2214)

						Signed-off-by: Jared Van Bortel <jared@nomic.ai>
						
						
					 
					
						2024-04-12 16:00:39 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
						
						0455b80b7f 
					 
					
						
						
							
							Embed4All: optionally count tokens, misc fixes (#2145)

						Key changes:
						* python: optionally return token count in Embed4All.embed
						* python and docs: models2.json -> models3.json
						* Embed4All: require explicit prefix for unknown models
						* llamamodel: fix shouldAddBOS for Bert and Nomic Bert
						Signed-off-by: Jared Van Bortel <jared@nomic.ai>
						
						
					 
					
						2024-03-20 11:24:02 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
						
						406e88b59a 
					 
					
						
						
							
							implement local Nomic Embed via llama.cpp (#2086)

						Signed-off-by: Jared Van Bortel <jared@nomic.ai>
						
						
					 
					
						2024-03-13 18:09:24 -04:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
						
						f500bcf6e5 
					 
					
						
						
							
							llmodel: default to a blank line between reply and next prompt (#1996)

						Also make some related adjustments to the provided Alpaca-style prompt templates
						and system prompts.
						Signed-off-by: Jared Van Bortel <jared@nomic.ai>
						
						
					 
					
						2024-02-26 13:11:15 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
						
						4fc4d94be4 
					 
					
						
						
							
							fix chat-style prompt templates (#1970)

						Also use a new version of Mistral OpenOrca.
						Signed-off-by: Jared Van Bortel <jared@nomic.ai>
						
						
					 
					
						2024-02-21 15:45:32 -05:00 
						 
				 
			
				
					
						
							
							
								Jared Van Bortel 
							
						 
					 
					
						
						
							
						
						3acbef14b7 
					 
					
						
						
							
							fix AVX support by removing direct linking to AVX2 libs (#1750)
						
						
						
						
					 
					
						2023-12-13 12:11:09 -05:00 
						 
				 
			
				
					
						
							
							
								Adam Treat 
							
						 
					 
					
						
						
							
						
						12f943e966 
					 
					
						
						
							
							Fix regenerate button to be deterministic and bump the llama version to latest we have for gguf.  
						
						
						
						
					 
					
						2023-10-05 18:16:19 -04:00 
						 
				 
			
				
					
						
							
							
								Adam Treat 
							
						 
					 
					
						
						
							
						
						045f6e6cdc 
					 
					
						
						
							
							Link against ggml in bin so we can get the available devices without loading a model.  
						
						
						
						
					 
					
						2023-09-15 14:45:25 -04:00 
						 
				 
			
				
					
						
							
							
								Adam Treat 
							
						 
					 
					
						
						
							
						
						0efdbfcffe 
					 
					
						
						
							
							Bert  
						
						
						
						
					 
					
						2023-07-13 14:21:46 -04:00 
						 
				 
			
				
					
						
							
							
								Adam Treat 
							
						 
					 
					
						
						
							
						
						315a1f2aa2 
					 
					
						
						
							
							Move it back as internal class.  
						
						
						
						
					 
					
						2023-07-13 14:21:46 -04:00 
						 
				 
			
				
					
						
							
							
								Adam Treat 
							
						 
					 
					
						
						
							
						
						1f749d7633 
					 
					
						
						
							
							Clean up backend code a bit and hide impl. details.  
						
						
						
						
					 
					
						2023-07-13 14:21:46 -04:00 
						 
				 
			
				
					
						
							
							
								Aaron Miller 
							
						 
					 
					
						
						
							
						
						7a5f6e4726 
					 
					
						
						
							
							limit prompt batch size to 128  
						
						
						
						
					 
					
						2023-06-30 21:07:21 -03:00 
						 
				 
			
				
					
						
							
							
								Aaron Miller 
							
						 
					 
					
						
						
							
						
						88616fde7f 
					 
					
						
						
							
							llmodel: change tokenToString to not use string_view (#968)

						Fixes a definite use-after-free and likely avoids some other
						potential ones. A std::string converts to a std::string_view
						automatically, but as soon as the std::string in question goes out of
						scope it is freed and the string_view points at freed
						memory. This is *mostly* fine if it's returning a reference to the
						tokenizer's internal vocab table, but it's, imo, too easy to return a
						reference to a dynamically constructed string this way, as replit is
						doing (and unfortunately needs to do, to convert the internal whitespace
						replacement symbol back to a space).
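
The hazard described in this message can be sketched with a toy tokenizer. Everything below is hypothetical and for illustration only (ToyTokenizer, kMarker, and the choice of U+2581 as the whitespace marker are assumptions, not the project's code). It shows the safe shape: a function that builds its result dynamically returns std::string by value rather than a std::string_view into a local.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Assumed whitespace marker for this sketch: U+2581 encoded as UTF-8.
const std::string kMarker = "\xE2\x96\x81";

// Hypothetical toy tokenizer, for illustration only.
struct ToyTokenizer {
    std::unordered_map<int, std::string> vocab{
        {0, "hello"},
        {1, kMarker + "world"},
    };

    // Returning std::string_view here would dangle, because `text` is a
    // local that is destroyed on return. Returning std::string by value
    // hands ownership of the characters to the caller.
    std::string tokenToString(int id) const
    {
        std::string text = vocab.at(id);
        for (std::size_t pos = text.find(kMarker); pos != std::string::npos;
             pos = text.find(kMarker, pos + 1)) {
            text.replace(pos, kMarker.size(), " ");
        }
        return text;
    }
};
```

Returning by value costs a copy or move per call, which is the trade-off the message accepts in exchange for removing the lifetime hazard.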
						
						
					 
					
						2023-06-13 07:14:02 -04:00 
						 
				 
			
				
					
						
							
							
								Adam Treat 
							
						 
					 
					
						
						
							
						
						301d2fdbea 
					 
					
						
						
							
							Fix up for newer models on reset context. This keeps the model from totally failing after a context reset.
						
						
						
						
					 
					
						2023-06-04 19:31:20 -04:00 
						 
				 
			
				
					
						
							
							
								AT 
							
						 
					 
					
						
						
							
						
						bbe195ee02 
					 
					
						
						
							
							Backend prompt dedup (#822)

						* Deduplicated prompt() function code
						
						
					 
					
						2023-06-04 08:59:24 -04:00 
						 
				 
			
				
					
						
							
							
								Adam Treat 
							
						 
					 
					
						
						
							
						
						70e3b7e907 
					 
					
						
						
							
							Try and fix build on mac.  
						
						
						
						
					 
					
						2023-06-02 10:47:12 -04:00