langchain.llms.textgen.TextGen¶
- class langchain.llms.textgen.TextGen(*, cache: Optional[bool] = None, verbose: bool = None, callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None, callback_manager: Optional[BaseCallbackManager] = None, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, model_url: str, preset: Optional[str] = None, max_new_tokens: Optional[int] = 250, do_sample: bool = True, temperature: Optional[float] = 1.3, top_p: Optional[float] = 0.1, typical_p: Optional[float] = 1, epsilon_cutoff: Optional[float] = 0, eta_cutoff: Optional[float] = 0, repetition_penalty: Optional[float] = 1.18, top_k: Optional[float] = 40, min_length: Optional[int] = 0, no_repeat_ngram_size: Optional[int] = 0, num_beams: Optional[int] = 1, penalty_alpha: Optional[float] = 0, length_penalty: Optional[float] = 1, early_stopping: bool = False, seed: int = -1, add_bos_token: bool = True, truncation_length: Optional[int] = 2048, ban_eos_token: bool = False, skip_special_tokens: bool = True, stopping_strings: Optional[List[str]] = [], streaming: bool = False)¶
Bases: LLM
Wrapper around text-generation-webui models.
To use, you should have text-generation-webui installed, a model loaded, and --api added as a command-line option.
Suggested installation: use the one-click installer for your OS: https://github.com/oobabooga/text-generation-webui#one-click-installers
The parameters below are taken from the text-generation-webui API example: https://github.com/oobabooga/text-generation-webui/blob/main/api-examples/api-example.py
Example
from langchain.llms import TextGen
llm = TextGen(model_url="http://localhost:8500")
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
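As a rough illustration of how the constructor fields relate to the web UI's API, the sketch below assembles a JSON request body from a subset of the defaults listed in the signature above. The `/api/v1/generate` path and the helper name `build_generate_request` are assumptions made for this example (the path is modelled on the linked api-example.py script) and should be checked against your installed text-generation-webui version. Nothing is sent over the network.

```python
import json
from typing import Any, Dict, Tuple

# Subset of the defaults from the class signature above.
DEFAULT_PARAMS: Dict[str, Any] = {
    "max_new_tokens": 250,
    "do_sample": True,
    "temperature": 1.3,
    "top_p": 0.1,
    "top_k": 40,
    "repetition_penalty": 1.18,
    "seed": -1,
    "stopping_strings": [],
}


def build_generate_request(model_url: str, prompt: str, **overrides: Any) -> Tuple[str, str]:
    """Return (url, json_body) for a hypothetical blocking generate call."""
    # Assumed endpoint path; this page does not state it.
    url = model_url.rstrip("/") + "/api/v1/generate"
    # Per-call overrides win over the defaults; the prompt is added last.
    payload = {**DEFAULT_PARAMS, **overrides, "prompt": prompt}
    return url, json.dumps(payload)


url, body = build_generate_request("http://localhost:8500", "Hello", temperature=0.7)
print(url)
print(body)
```

Keeping the defaults in one dictionary makes it easy to see which values a call overrides and which it inherits.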
- param add_bos_token: bool = True¶
Add the bos_token to the beginning of prompts. Disabling this can make the replies more creative.
- param ban_eos_token: bool = False¶
Ban the eos_token. Forces the model to never end the generation prematurely.
- param cache: Optional[bool] = None¶
- param callback_manager: Optional[BaseCallbackManager] = None¶
- param callbacks: Callbacks = None¶
- param do_sample: bool = True¶
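Whether to sample from the token distribution. If False, decoding is deterministic (for example, greedy or beam search).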
- param early_stopping: bool = False¶
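Whether beam search stops as soon as enough finished candidates are available. Only relevant when num_beams is greater than 1.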
- param epsilon_cutoff: Optional[float] = 0¶
Epsilon cutoff: a probability floor below which tokens are excluded from sampling. 0 disables it.
- param eta_cutoff: Optional[float] = 0¶
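Eta cutoff: the cutoff used by eta sampling, an entropy-dependent variant of the epsilon cutoff. 0 disables it.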
- param length_penalty: Optional[float] = 1¶
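Length penalty: an exponential penalty on sequence length used when scoring beam-search candidates. Only relevant when num_beams is greater than 1.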
- param max_new_tokens: Optional[int] = 250¶
The maximum number of tokens to generate.
- param metadata: Optional[Dict[str, Any]] = None¶
Metadata to add to the run trace.
- param min_length: Optional[int] = 0¶
Minimum generation length in tokens.
- param model_url: str [Required]¶
The full URL of the text-generation-webui instance, including scheme, host, and port (http[s]://host:port).
- param no_repeat_ngram_size: Optional[int] = 0¶
If not set to 0, specifies the length of token sets that are completely blocked from repeating at all. Higher values = blocks larger phrases, lower values = blocks words or letters from repeating. Only 0 or high values are a good idea in most cases.
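A minimal sketch of the n-gram blocking idea described above, assuming tokens are plain integer ids. The function name `banned_next_tokens` is hypothetical; the real blocking happens inside the model backend, not in this wrapper.

```python
from typing import List, Set


def banned_next_tokens(generated: List[int], ngram_size: int) -> Set[int]:
    """Token ids that would complete an n-gram already present in `generated`."""
    if ngram_size <= 0 or len(generated) < ngram_size:
        return set()
    # The last (n - 1) tokens are the prefix of the n-gram about to be completed.
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])
    banned: Set[int] = set()
    for i in range(len(generated) - ngram_size + 1):
        ngram = generated[i:i + ngram_size]
        # If an earlier n-gram starts with the same prefix, its last token is banned.
        if tuple(ngram[:-1]) == prefix:
            banned.add(ngram[-1])
    return banned


# With size 3, the sequence "5 6" was previously followed by 7, so 7 is blocked.
print(banned_next_tokens([5, 6, 7, 5, 6], 3))
```

With a size of 1 every previously generated token is banned, which is why small non-zero values tend to degrade output.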
- param num_beams: Optional[int] = 1¶
Number of beams for beam search. 1 means no beam search.
- param penalty_alpha: Optional[float] = 0¶
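Penalty alpha for contrastive search, which balances model confidence against a degeneration penalty. 0 disables contrastive search.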
- param preset: Optional[str] = None¶
The preset to use in the textgen webui
- param repetition_penalty: Optional[float] = 1.18¶
Exponential penalty factor for repeating prior tokens. 1 means no penalty, higher value = less repetition, lower value = more repetition.
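One common formulation of this penalty, used for instance by the repetition penalty processor in Hugging Face transformers, divides positive logits of already-seen tokens by the penalty and multiplies negative ones by it. The sketch below assumes that formulation and a plain dictionary of logits; whether your backend applies exactly this rule is an assumption, and `apply_repetition_penalty` is a name chosen for the example.

```python
from typing import Dict, Iterable


def apply_repetition_penalty(
    logits: Dict[int, float], seen: Iterable[int], penalty: float
) -> Dict[int, float]:
    """Return a copy of `logits` with previously seen token ids made less likely."""
    out = dict(logits)
    for tok in set(seen):
        if tok in out:
            score = out[tok]
            # Positive scores shrink, negative scores move further from zero.
            out[tok] = score / penalty if score > 0 else score * penalty
    return out


print(apply_repetition_penalty({0: 2.0, 1: -1.0, 2: 0.5}, seen=[0, 1], penalty=2.0))
```

A penalty of 1 leaves all logits unchanged, matching the "1 means no penalty" note above.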
- param seed: int = -1¶
Seed (-1 for random)
- param skip_special_tokens: bool = True¶
Skip special tokens. Some specific models need this unset.
- param stopping_strings: Optional[List[str]] = []¶
A list of strings to stop generation when encountered.
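To illustrate the effect of stopping strings, the sketch below cuts a piece of text at the first occurrence of any stop string. It is a client-side illustration only; the helper `truncate_at_stop` is hypothetical, and the server is assumed to do the actual stopping during generation.

```python
from typing import List


def truncate_at_stop(text: str, stops: List[str]) -> str:
    """Cut `text` at the earliest occurrence of any non-empty stop string."""
    cut = len(text)
    for stop in stops:
        if not stop:
            continue  # an empty string would match at index 0
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


print(truncate_at_stop("Answer: 42\nUser: next question", ["\nUser:", "###"]))
```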
- param streaming: bool = False¶
Whether to stream the results, token by token (currently unimplemented).
- param tags: Optional[List[str]] = None¶
Tags to add to the run trace.
- param temperature: Optional[float] = 1.3¶
Primary factor to control randomness of outputs. 0 = deterministic (only the most likely token is used). Higher value = more randomness.
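The effect of temperature can be sketched as a scaled softmax: logits are divided by the temperature before being normalised into probabilities. This is a standard formulation, not code from this wrapper, and treating 0 as "pick the most likely token" is an assumption made here to mirror the description above.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after scaling by `temperature`."""
    if temperature <= 0:
        # Treat 0 as greedy decoding: all probability on the arg-max token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
print(softmax_with_temperature(logits, 0.5))  # sharper distribution
print(softmax_with_temperature(logits, 1.3))  # flatter distribution
```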
- param top_k: Optional[float] = 40¶
Similar to top_p, but select instead only the top_k most likely tokens. Higher value = higher range of possible random results.
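A minimal sketch of top-k filtering over a toy probability table, assuming probabilities are already normalised. The function name `top_k_filter` and the string tokens are illustrative only.

```python
from typing import Dict


def top_k_filter(probs: Dict[str, float], k: int) -> Dict[str, float]:
    """Keep the k most likely tokens and renormalise their probabilities."""
    if k <= 0 or k >= len(probs):
        kept = dict(probs)  # nothing to filter
    else:
        ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
        kept = dict(ranked[:k])
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


print(top_k_filter({"a": 0.5, "b": 0.3, "c": 0.2}, 2))
```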
- param top_p: Optional[float] = 0.1¶
If not set to 1, select tokens with probabilities adding up to less than this number. Higher value = higher range of possible random results.
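Top-p (nucleus) filtering can be sketched as follows: tokens are taken in order of decreasing probability until their cumulative probability reaches the threshold. Implementations differ on whether the token that crosses the threshold is kept; this sketch keeps it so that at least one token always survives, which is an assumption rather than documented behaviour of the backend.

```python
from typing import Dict


def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for tok, p in ranked:
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


probs = {"a": 0.5, "b": 0.3, "c": 0.2}
print(top_p_filter(probs, 0.1))  # the default keeps only the single best token here
print(top_p_filter(probs, 0.7))
```

With the default of 0.1, sampling is close to greedy for most distributions, which offsets the fairly high default temperature.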
- param truncation_length: Optional[int] = 2048¶
Truncate the prompt up to this length. The leftmost tokens are removed if the prompt exceeds this length. Most models require this to be at most 2048.
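Left truncation as described above amounts to keeping only the last `truncation_length` token ids. The sketch assumes the prompt is already tokenised into a list of integers; `truncate_left` is a hypothetical helper.

```python
from typing import List


def truncate_left(token_ids: List[int], truncation_length: int) -> List[int]:
    """Drop the leftmost tokens so that at most `truncation_length` remain."""
    if truncation_length <= 0:
        return []  # guard: a slice starting at -0 would return the whole list
    return token_ids[-truncation_length:]


print(truncate_left(list(range(10)), 4))
```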
- param typical_p: Optional[float] = 1¶
If not set to 1, select only tokens that are at least this much more likely to appear than random tokens, given the prior text.
- param verbose: bool [Optional]¶
Whether to print out response text.
- __call__(prompt: str, stop: Optional[List[str]] = None, callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None, *, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, **kwargs: Any) str¶
Check the cache and run the LLM on the given prompt and input.
- async abatch(inputs: List[Union[PromptValue, str, List[BaseMessage]]], config: Optional[Union[RunnableConfig, List[RunnableConfig]]] = None, max_concurrency: Optional[int] = None, **kwargs: Any) List[str]¶
- async agenerate(prompts: List[str], stop: Optional[List[str]] = None, callbacks: Union[List[BaseCallbackHandler], BaseCallbackManager, None, List[Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]]] = None, *, tags: Optional[Union[List[str], List[List[str]]]] = None, metadata: Optional[Union[Dict[str, Any], List[Dict[str, Any]]]] = None, **kwargs: Any) LLMResult¶
Run the LLM on the given prompt and input.
- async agenerate_prompt(prompts: List[PromptValue], stop: Optional[List[str]] = None, callbacks: Union[List[BaseCallbackHandler], BaseCallbackManager, None, List[Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]]] = None, **kwargs: Any) LLMResult¶
Asynchronously pass a sequence of prompts and return model generations.
This method should make use of batched calls for models that expose a batched API.
Use this method when you want to:
- take advantage of batched calls,
- get more output from the model than just the top generated value,
- build chains that are agnostic to the underlying language model type (e.g., pure text completion models vs chat models).
- Parameters
prompts – List of PromptValues. A PromptValue is an object that can be converted to match the format of any language model (string for pure text generation models and BaseMessages for chat models).
stop – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
callbacks – Callbacks to pass through. Used for executing additional functionality, such as logging or streaming, throughout generation.
**kwargs – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
- Returns
An LLMResult, which contains a list of candidate Generations for each input prompt and additional model provider-specific output.
- async ainvoke(input: Union[PromptValue, str, List[BaseMessage]], config: Optional[RunnableConfig] = None, *, stop: Optional[List[str]] = None, **kwargs: Any) str¶
- async apredict(text: str, *, stop: Optional[Sequence[str]] = None, **kwargs: Any) str¶
Asynchronously pass a string to the model and return a string prediction.
Use this method when calling pure text generation models and only the top candidate generation is needed.
- Parameters
text – String input to pass to the model.
stop – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
**kwargs – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
- Returns
Top model prediction as a string.
- async apredict_messages(messages: List[BaseMessage], *, stop: Optional[Sequence[str]] = None, **kwargs: Any) BaseMessage¶
Asynchronously pass messages to the model and return a message prediction.
Use this method when calling chat models and only the top candidate generation is needed.
- Parameters
messages – A sequence of chat messages corresponding to a single model input.
stop – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
**kwargs – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
- Returns
Top model prediction as a message.
- async astream(input: Union[PromptValue, str, List[BaseMessage]], config: Optional[RunnableConfig] = None, *, stop: Optional[List[str]] = None, **kwargs: Any) AsyncIterator[str]¶
- batch(inputs: List[Union[PromptValue, str, List[BaseMessage]]], config: Optional[Union[RunnableConfig, List[RunnableConfig]]] = None, max_concurrency: Optional[int] = None, **kwargs: Any) List[str]¶
- dict(**kwargs: Any) Dict¶
Return a dictionary of the LLM.
- generate(prompts: List[str], stop: Optional[List[str]] = None, callbacks: Union[List[BaseCallbackHandler], BaseCallbackManager, None, List[Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]]] = None, *, tags: Optional[Union[List[str], List[List[str]]]] = None, metadata: Optional[Union[Dict[str, Any], List[Dict[str, Any]]]] = None, **kwargs: Any) LLMResult¶
Run the LLM on the given prompt and input.
- generate_prompt(prompts: List[PromptValue], stop: Optional[List[str]] = None, callbacks: Union[List[BaseCallbackHandler], BaseCallbackManager, None, List[Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]]] = None, **kwargs: Any) LLMResult¶
Pass a sequence of prompts to the model and return model generations.
This method should make use of batched calls for models that expose a batched API.
Use this method when you want to:
- take advantage of batched calls,
- get more output from the model than just the top generated value,
- build chains that are agnostic to the underlying language model type (e.g., pure text completion models vs chat models).
- Parameters
prompts – List of PromptValues. A PromptValue is an object that can be converted to match the format of any language model (string for pure text generation models and BaseMessages for chat models).
stop – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
callbacks – Callbacks to pass through. Used for executing additional functionality, such as logging or streaming, throughout generation.
**kwargs – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
- Returns
An LLMResult, which contains a list of candidate Generations for each input prompt and additional model provider-specific output.
- get_num_tokens(text: str) int¶
Get the number of tokens present in the text.
Useful for checking if an input will fit in a model’s context window.
- Parameters
text – The string input to tokenize.
- Returns
The integer number of tokens in the text.
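A minimal sketch of the context-window check mentioned above. It takes any token-counting callable, so in practice you could pass `llm.get_num_tokens`; the whitespace counter and the helper name `fits_context` are stand-ins for illustration, and the default numbers mirror `truncation_length` and `max_new_tokens` from the signature above.

```python
from typing import Callable


def fits_context(
    prompt: str,
    count_tokens: Callable[[str], int],
    context_window: int = 2048,
    max_new_tokens: int = 250,
) -> bool:
    """True if the prompt plus the requested completion fits in the window."""
    return count_tokens(prompt) + max_new_tokens <= context_window


def whitespace_count(text: str) -> int:
    """Crude stand-in tokenizer: counts whitespace-separated words."""
    return len(text.split())


print(fits_context("a short prompt", whitespace_count))
```

Reserving room for `max_new_tokens` avoids sending a prompt that fits on its own but leaves no space for the reply.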
- get_num_tokens_from_messages(messages: List[BaseMessage]) int¶
Get the number of tokens in the messages.
Useful for checking if an input will fit in a model’s context window.
- Parameters
messages – The message inputs to tokenize.
- Returns
The sum of the number of tokens across the messages.
- get_token_ids(text: str) List[int]¶
Return the ordered ids of the tokens in a text.
- Parameters
text – The string input to tokenize.
- Returns
A list of ids corresponding to the tokens in the text, in the order they occur in the text.
- invoke(input: Union[PromptValue, str, List[BaseMessage]], config: Optional[RunnableConfig] = None, *, stop: Optional[List[str]] = None, **kwargs: Any) str¶
- predict(text: str, *, stop: Optional[Sequence[str]] = None, **kwargs: Any) str¶
Pass a single string input to the model and return a string prediction.
Use this method when passing in raw text. If you want to pass in specific types of chat messages, use predict_messages.
- Parameters
text – String input to pass to the model.
stop – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
**kwargs – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
- Returns
Top model prediction as a string.
- predict_messages(messages: List[BaseMessage], *, stop: Optional[Sequence[str]] = None, **kwargs: Any) BaseMessage¶
Pass a message sequence to the model and return a message prediction.
Use this method when passing in chat messages. If you want to pass in raw text, use predict.
- Parameters
messages – A sequence of chat messages corresponding to a single model input.
stop – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
**kwargs – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
- Returns
Top model prediction as a message.
- validator raise_deprecation » all fields¶
Raise deprecation warning if callback_manager is used.
- save(file_path: Union[Path, str]) None¶
Save the LLM.
- Parameters
file_path – Path to file to save the LLM to.
Example:
llm.save(file_path="path/llm.yaml")
- validator set_verbose » verbose¶
If verbose is None, set it to the global verbosity setting.
This lets users pass None as verbose to use the global setting.
- stream(input: Union[PromptValue, str, List[BaseMessage]], config: Optional[RunnableConfig] = None, *, stop: Optional[List[str]] = None, **kwargs: Any) Iterator[str]¶
- to_json() Union[SerializedConstructor, SerializedNotImplemented]¶
- to_json_not_implemented() SerializedNotImplemented¶
- property lc_attributes: Dict¶
Return a list of attribute names that should be included in the serialized kwargs. These attributes must be accepted by the constructor.
- property lc_namespace: List[str]¶
Return the namespace of the langchain object, e.g. ["langchain", "llms", "openai"].
- property lc_secrets: Dict[str, str]¶
Return a map of constructor argument names to secret ids, e.g. {"openai_api_key": "OPENAI_API_KEY"}.
- property lc_serializable: bool¶
Return whether or not the class is serializable.