Takeoff baseurl support (#10091)

## Description
This PR introduces a minor change to the TitanTakeoff integration.
Instead of specifying a port on localhost, users can now specify a base
URL. This allows the integration to be used when Titan Takeoff is
deployed externally (not on localhost), and removes the hardcoded
reference to localhost, `"http://localhost:{port}"`.
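Concretely, the request URLs are now composed from a configurable base instead of a fixed host. A minimal sketch of the new composition (the external hostname below is hypothetical, purely for illustration):

```python
# Sketch of the new URL composition; the host below is a hypothetical
# external deployment, no longer hardcoded to http://localhost:{port}.
base_url = "http://takeoff.example.com:3000"
generate_url = f"{base_url}/generate"
stream_url = f"{base_url}/generate_stream"
```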

### Info about Titan Takeoff
Titan Takeoff is an inference server created by
[TitanML](https://www.titanml.co/) that lets you deploy large language
models locally on your own hardware with a single command. Most
generative model architectures are supported, including Falcon, Llama 2,
GPT-2, T5 and many more.

Read more about Titan Takeoff here:
- [Blog](https://medium.com/@TitanML/introducing-titan-takeoff-6c30e55a8e1e)
- [Docs](https://docs.titanml.co/docs/titan-takeoff/getting-started)

### Dependencies
No new dependencies are introduced. However, to use the Titan Takeoff
integration, users need to install the titan-iris package in their local
environment and start the Titan Takeoff inference server.
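As a setup sketch (the exact CLI invocation and flags may differ; the model name is purely illustrative, see the Takeoff docs linked above):

```shell
# Install the titan-iris client package (assumption: distributed via pip)
pip install titan-iris

# Launch the Takeoff inference server; model name and flags are illustrative
iris takeoff --model tiiuae/falcon-7b-instruct --device cuda
```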

Thanks for your help and please let me know if you have any questions.
cc: @hwchase17 @baskaryan

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Committed by Blake (Yung Cher Ho) on 2023-09-03 22:45:59 +01:00, via GitHub.
Commit f4bed8a04c (parent 05664a6f20).
2 changed files with 34 additions and 12 deletions.


```diff
@@ -10,8 +10,10 @@ from langchain.schema.output import GenerationChunk
 class TitanTakeoff(LLM):
-    port: int = 8000
-    """Specifies the port to use for the Titan Takeoff API. Default = 8000."""
+    base_url: str = "http://localhost:8000"
+    """Specifies the baseURL to use for the Titan Takeoff API.
+    Default = http://localhost:8000.
+    """
     generate_max_length: int = 128
     """Maximum generation length. Default = 128."""
@@ -92,7 +94,7 @@ class TitanTakeoff(LLM):
                 text_output += chunk.text
             return text_output
-        url = f"http://localhost:{self.port}/generate"
+        url = f"{self.base_url}/generate"
         params = {"text": prompt, **self._default_params}
         response = requests.post(url, json=params)
@@ -139,7 +141,7 @@ class TitanTakeoff(LLM):
                 response = model(prompt)
         """
-        url = f"http://localhost:{self.port}/generate_stream"
+        url = f"{self.base_url}/generate_stream"
         params = {"text": prompt, **self._default_params}
         response = requests.post(url, json=params, stream=True)
@@ -154,4 +156,4 @@ class TitanTakeoff(LLM):
     @property
     def _identifying_params(self) -> Mapping[str, Any]:
         """Get the identifying parameters."""
-        return {"port": self.port, **{}, **self._default_params}
+        return {"base_url": self.base_url, **{}, **self._default_params}
```