langchain.llms.rwkv.RWKV¶
- class langchain.llms.rwkv.RWKV(*, cache: Optional[bool] = None, verbose: bool = None, callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None, callback_manager: Optional[BaseCallbackManager] = None, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, model: str, tokens_path: str, strategy: str = 'cpu fp32', rwkv_verbose: bool = True, temperature: float = 1.0, top_p: float = 0.5, penalty_alpha_frequency: float = 0.4, penalty_alpha_presence: float = 0.4, CHUNK_LEN: int = 256, max_tokens_per_generation: int = 256, client: Any = None, tokenizer: Any = None, pipeline: Any = None, model_tokens: Any = None, model_state: Any = None)[source]¶
Bases: LLM, BaseModel
RWKV language models.
To use, you should have the rwkv python package installed, the pre-trained model file, and the model’s config information.
Example
from langchain.llms import RWKV
# tokens_path is required; the tokenizer path below is a placeholder.
model = RWKV(model="./models/rwkv-3b-fp16.bin", tokens_path="./models/tokenizer.json", strategy="cpu fp32")
# Simplest invocation
response = model("Once upon a time, ")
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- param CHUNK_LEN: int = 256¶
Batch size for prompt processing.
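As a rough illustration of what a prompt-processing batch size means, the sketch below splits a list of token ids into consecutive chunks of at most CHUNK_LEN tokens. The helper name iter_chunks and the integer token ids are hypothetical; this is not the library's internal code.

```python
from typing import Iterator, List


def iter_chunks(tokens: List[int], chunk_len: int = 256) -> Iterator[List[int]]:
    """Yield consecutive slices of `tokens`, each at most `chunk_len` long."""
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    for start in range(0, len(tokens), chunk_len):
        yield tokens[start:start + chunk_len]


# Hypothetical prompt of 600 token ids, processed 256 at a time.
prompt_tokens = list(range(600))
chunk_sizes = [len(chunk) for chunk in iter_chunks(prompt_tokens, chunk_len=256)]
```

A smaller chunk length generally trades speed for lower peak memory while the prompt is being consumed.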
- param cache: Optional[bool] = None¶
- param callback_manager: Optional[BaseCallbackManager] = None¶
- param callbacks: Callbacks = None¶
- param max_tokens_per_generation: int = 256¶
Maximum number of tokens to generate.
- param metadata: Optional[Dict[str, Any]] = None¶
Metadata to add to the run trace.
- param model: str [Required]¶
Path to the pre-trained RWKV model file.
- param penalty_alpha_frequency: float = 0.4¶
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.
- param penalty_alpha_presence: float = 0.4¶
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.
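The two penalties can be pictured as adjustments to per-token scores before sampling. The sketch below assumes the common formulation in which each already-generated token loses a flat presence penalty plus a frequency penalty multiplied by its count. The function name, the dict-based logits, and the exact formula are assumptions for illustration and are not taken from this class's source.

```python
from collections import Counter
from typing import Dict, List


def apply_repetition_penalties(
    logits: Dict[int, float],
    generated: List[int],
    alpha_frequency: float = 0.4,
    alpha_presence: float = 0.4,
) -> Dict[int, float]:
    """Return a copy of `logits` with penalties applied to already-seen tokens."""
    counts = Counter(generated)
    penalized = dict(logits)
    for token, count in counts.items():
        if token in penalized:
            # Presence: flat cost for having appeared at all.
            # Frequency: extra cost for each occurrence so far.
            penalized[token] -= alpha_presence + count * alpha_frequency
    return penalized


# Token 7 appeared twice, token 3 once, token 9 never.
scores = apply_repetition_penalties({3: 1.0, 7: 1.0, 9: 1.0}, generated=[7, 3, 7])
```

Under these assumptions the unseen token keeps its score, and the token seen twice is pushed down further than the token seen once.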
- param rwkv_verbose: bool = True¶
Print debug information.
- param strategy: str = 'cpu fp32'¶
Device and precision strategy string passed to the rwkv package when loading the model (for example, 'cpu fp32').
- param tags: Optional[List[str]] = None¶
Tags to add to the run trace.
- param temperature: float = 1.0¶
The temperature to use for sampling.
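To make the effect of temperature concrete, here is a minimal sketch that divides logits by the temperature before a softmax. This is the conventional definition and may differ in detail from how the rwkv package applies it; the function name and the sample logits are hypothetical.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Convert logits to probabilities after scaling them by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=0.5)  # more peaked
flat = softmax_with_temperature(logits, temperature=2.0)   # more uniform
```

Values below 1.0 concentrate probability on the most likely token, while values above 1.0 spread it across more tokens.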
- param tokens_path: str [Required]¶
Path to the RWKV tokens file.
- param top_p: float = 0.5¶
The top-p value to use for sampling.
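Top-p (nucleus) sampling restricts sampling to the smallest set of most likely tokens whose cumulative probability reaches top_p. The sketch below shows that filtering step on a toy distribution; the function name and the example vocabulary are hypothetical, and the rwkv package's own implementation may differ.

```python
from typing import Dict


def top_p_filter(probs: Dict[str, float], top_p: float = 0.5) -> Dict[str, float]:
    """Keep the most likely tokens until their cumulative probability reaches top_p, then renormalize."""
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for token, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


nucleus = top_p_filter({"the": 0.4, "a": 0.3, "an": 0.2, "this": 0.1}, top_p=0.5)
```

With top_p=0.5, only the two most likely tokens survive in this toy example, and the next token would be sampled from those two.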
- param verbose: bool [Optional]¶
Whether to print out response text.
- __call__(prompt: str, stop: Optional[List[str]] = None, callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None, *, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, **kwargs: Any) str¶
Check the cache and run the LLM on the given prompt and input.
- async abatch(inputs: List[Union[PromptValue, str, List[BaseMessage]]], config: Optional[Union[RunnableConfig, List[RunnableConfig]]] = None, max_concurrency: Optional[int] = None, **kwargs: Any) List[str]¶
- async agenerate(prompts: List[str], stop: Optional[List[str]] = None, callbacks: Union[List[BaseCallbackHandler], BaseCallbackManager, None, List[Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]]] = None, *, tags: Optional[Union[List[str], List[List[str]]]] = None, metadata: Optional[Union[Dict[str, Any], List[Dict[str, Any]]]] = None, **kwargs: Any) LLMResult¶
Asynchronously run the LLM on the given prompts and input.
- async agenerate_prompt(prompts: List[PromptValue], stop: Optional[List[str]] = None, callbacks: Union[List[BaseCallbackHandler], BaseCallbackManager, None, List[Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]]] = None, **kwargs: Any) LLMResult¶
Asynchronously pass a sequence of prompts and return model generations.
This method should make use of batched calls for models that expose a batched API.
- Use this method when you want to:
take advantage of batched calls,
need more output from the model than just the top generated value,
are building chains that are agnostic to the underlying language model type (e.g., pure text completion models vs chat models).
- Parameters
prompts – List of PromptValues. A PromptValue is an object that can be converted to match the format of any language model (string for pure text generation models and BaseMessages for chat models).
stop – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
callbacks – Callbacks to pass through. Used for executing additional functionality, such as logging or streaming, throughout generation.
**kwargs – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
- Returns
An LLMResult, which contains a list of candidate Generations for each input prompt and additional model provider-specific output.
- async ainvoke(input: Union[PromptValue, str, List[BaseMessage]], config: Optional[RunnableConfig] = None, *, stop: Optional[List[str]] = None, **kwargs: Any) str¶
- async apredict(text: str, *, stop: Optional[Sequence[str]] = None, **kwargs: Any) str¶
Asynchronously pass a string to the model and return a string prediction.
Use this method when calling pure text generation models and only the top candidate generation is needed.
- Parameters
text – String input to pass to the model.
stop – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
**kwargs – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
- Returns
Top model prediction as a string.
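The stop parameter described above cuts the output at the first occurrence of any stop substring. A minimal sketch of that truncation, assuming plain substring matching (the helper truncate_at_stop is hypothetical and not part of the library):

```python
from typing import List, Optional


def truncate_at_stop(text: str, stop: Optional[List[str]] = None) -> str:
    """Cut `text` at the earliest occurrence of any stop substring."""
    if not stop:
        return text
    # Ignore empty stop strings and those that never occur in the text.
    positions = [text.find(s) for s in stop if s and s in text]
    return text[: min(positions)] if positions else text


completion = "Paris is the capital.\nQ: What about Spain?"
answer = truncate_at_stop(completion, stop=["\nQ:", "###"])
```

The stop substring itself is not included in the returned text in this sketch.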
- async apredict_messages(messages: List[BaseMessage], *, stop: Optional[Sequence[str]] = None, **kwargs: Any) BaseMessage¶
Asynchronously pass messages to the model and return a message prediction.
Use this method when calling chat models and only the top candidate generation is needed.
- Parameters
messages – A sequence of chat messages corresponding to a single model input.
stop – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
**kwargs – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
- Returns
Top model prediction as a message.
- async astream(input: Union[PromptValue, str, List[BaseMessage]], config: Optional[RunnableConfig] = None, *, stop: Optional[List[str]] = None, **kwargs: Any) AsyncIterator[str]¶
- batch(inputs: List[Union[PromptValue, str, List[BaseMessage]]], config: Optional[Union[RunnableConfig, List[RunnableConfig]]] = None, max_concurrency: Optional[int] = None, **kwargs: Any) List[str]¶
- dict(**kwargs: Any) Dict¶
Return a dictionary of the LLM.
- generate(prompts: List[str], stop: Optional[List[str]] = None, callbacks: Union[List[BaseCallbackHandler], BaseCallbackManager, None, List[Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]]] = None, *, tags: Optional[Union[List[str], List[List[str]]]] = None, metadata: Optional[Union[Dict[str, Any], List[Dict[str, Any]]]] = None, **kwargs: Any) LLMResult¶
Run the LLM on the given prompts and input.
- generate_prompt(prompts: List[PromptValue], stop: Optional[List[str]] = None, callbacks: Union[List[BaseCallbackHandler], BaseCallbackManager, None, List[Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]]] = None, **kwargs: Any) LLMResult¶
Pass a sequence of prompts to the model and return model generations.
This method should make use of batched calls for models that expose a batched API.
- Use this method when you want to:
take advantage of batched calls,
need more output from the model than just the top generated value,
are building chains that are agnostic to the underlying language model type (e.g., pure text completion models vs chat models).
- Parameters
prompts – List of PromptValues. A PromptValue is an object that can be converted to match the format of any language model (string for pure text generation models and BaseMessages for chat models).
stop – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
callbacks – Callbacks to pass through. Used for executing additional functionality, such as logging or streaming, throughout generation.
**kwargs – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
- Returns
An LLMResult, which contains a list of candidate Generations for each input prompt and additional model provider-specific output.
- get_num_tokens(text: str) int¶
Get the number of tokens present in the text.
Useful for checking if an input will fit in a model’s context window.
- Parameters
text – The string input to tokenize.
- Returns
The integer number of tokens in the text.
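One way to use a token count for the context-window check mentioned above is sketched here. The helper fits_context, the context_window value, and the whitespace tokenizer stand-in are all assumptions for illustration; a real check would pass the model's own get_num_tokens as the counting function.

```python
from typing import Callable


def fits_context(
    text: str,
    count_tokens: Callable[[str], int],
    context_window: int,
    reserved_for_output: int = 256,
) -> bool:
    """Return True if the prompt plus the reserved output budget fits in the window."""
    return count_tokens(text) + reserved_for_output <= context_window


# Stand-in tokenizer: whitespace split. A real model's tokenizer counts differently.
def whitespace_count(text: str) -> int:
    return len(text.split())


ok = fits_context("Once upon a time", whitespace_count, context_window=1024)
```

The reserved_for_output default mirrors the max_tokens_per_generation default of 256 only as a convenient example value.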
- get_num_tokens_from_messages(messages: List[BaseMessage]) int¶
Get the number of tokens in the messages.
Useful for checking if an input will fit in a model’s context window.
- Parameters
messages – The message inputs to tokenize.
- Returns
The sum of the number of tokens across the messages.
- get_token_ids(text: str) List[int]¶
Return the ordered ids of the tokens in a text.
- Parameters
text – The string input to tokenize.
- Returns
A list of ids corresponding to the tokens in the text, in the order they occur in the text.
- invoke(input: Union[PromptValue, str, List[BaseMessage]], config: Optional[RunnableConfig] = None, *, stop: Optional[List[str]] = None, **kwargs: Any) str¶
- predict(text: str, *, stop: Optional[Sequence[str]] = None, **kwargs: Any) str¶
Pass a single string input to the model and return a string prediction.
Use this method when passing in raw text. If you want to pass in specific types of chat messages, use predict_messages.
- Parameters
text – String input to pass to the model.
stop – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
**kwargs – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
- Returns
Top model prediction as a string.
- predict_messages(messages: List[BaseMessage], *, stop: Optional[Sequence[str]] = None, **kwargs: Any) BaseMessage¶
Pass a message sequence to the model and return a message prediction.
Use this method when passing in chat messages. If you want to pass in raw text, use predict.
- Parameters
messages – A sequence of chat messages corresponding to a single model input.
stop – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
**kwargs – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
- Returns
Top model prediction as a message.
- validator raise_deprecation » all fields¶
Raise deprecation warning if callback_manager is used.
- save(file_path: Union[Path, str]) None¶
Save the LLM.
- Parameters
file_path – Path to file to save the LLM to.
Example:
llm.save(file_path="path/llm.yaml")
- validator set_verbose » verbose¶
If verbose is None, set it.
This allows users to pass in None as verbose to access the global setting.
- stream(input: Union[PromptValue, str, List[BaseMessage]], config: Optional[RunnableConfig] = None, *, stop: Optional[List[str]] = None, **kwargs: Any) Iterator[str]¶
- to_json() Union[SerializedConstructor, SerializedNotImplemented]¶
- to_json_not_implemented() SerializedNotImplemented¶
- validator validate_environment » all fields[source]¶
Validate that the python package exists in the environment.
- property lc_attributes: Dict¶
Return a list of attribute names that should be included in the serialized kwargs. These attributes must be accepted by the constructor.
- property lc_namespace: List[str]¶
Return the namespace of the langchain object, e.g. ["langchain", "llms", "openai"].
- property lc_secrets: Dict[str, str]¶
Return a map of constructor argument names to secret ids, e.g. {"openai_api_key": "OPENAI_API_KEY"}.
- property lc_serializable: bool¶
Return whether or not the class is serializable.