langchain.retrievers.tfidf.TFIDFRetriever

class langchain.retrievers.tfidf.TFIDFRetriever(*, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, vectorizer: Any = None, docs: List[Document], tfidf_array: Any = None, k: int = 4)[source]

Bases: BaseRetriever

TF-IDF Retriever.

Largely based on https://github.com/asvskartheek/Text-Retrieval/blob/master/TF-IDF%20Search%20Engine%20(SKLEARN).ipynb

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

param docs: List[langchain.schema.document.Document] [Required]

Documents.

param k: int = 4

Number of documents to return.

param metadata: Optional[Dict[str, Any]] = None

Optional metadata associated with the retriever. Defaults to None This metadata will be associated with each call to this retriever, and passed as arguments to the handlers defined in callbacks. You can use these to eg identify a specific instance of a retriever with its use case.

param tags: Optional[List[str]] = None

Optional list of tags associated with the retriever. Defaults to None These tags will be associated with each call to this retriever, and passed as arguments to the handlers defined in callbacks. You can use these to eg identify a specific instance of a retriever with its use case.

param tfidf_array: Any = None

TF-IDF array.

param vectorizer: Any = None

TF-IDF vectorizer.

async aget_relevant_documents(query: str, *, callbacks: Callbacks = None, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, **kwargs: Any) List[Document]

Asynchronously get documents relevant to a query. :param query: string to find relevant documents for :param callbacks: Callback manager or list of callbacks :param tags: Optional list of tags associated with the retriever. Defaults to None

These tags will be associated with each call to this retriever, and passed as arguments to the handlers defined in callbacks.

Parameters

metadata – Optional metadata associated with the retriever. Defaults to None This metadata will be associated with each call to this retriever, and passed as arguments to the handlers defined in callbacks.

Returns

List of relevant documents

async ainvoke(input: str, config: Optional[RunnableConfig] = None) List[Document]
classmethod from_documents(documents: Iterable[Document], *, tfidf_params: Optional[Dict[str, Any]] = None, **kwargs: Any) TFIDFRetriever[source]
classmethod from_texts(texts: Iterable[str], metadatas: Optional[Iterable[dict]] = None, tfidf_params: Optional[Dict[str, Any]] = None, **kwargs: Any) TFIDFRetriever[source]
get_relevant_documents(query: str, *, callbacks: Callbacks = None, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, **kwargs: Any) List[Document]

Retrieve documents relevant to a query. :param query: string to find relevant documents for :param callbacks: Callback manager or list of callbacks :param tags: Optional list of tags associated with the retriever. Defaults to None

These tags will be associated with each call to this retriever, and passed as arguments to the handlers defined in callbacks.

Parameters

metadata – Optional metadata associated with the retriever. Defaults to None This metadata will be associated with each call to this retriever, and passed as arguments to the handlers defined in callbacks.

Returns

List of relevant documents

invoke(input: str, config: Optional[RunnableConfig] = None) List[Document]
to_json() Union[SerializedConstructor, SerializedNotImplemented]
to_json_not_implemented() SerializedNotImplemented
property lc_attributes: Dict

Return a list of attribute names that should be included in the serialized kwargs. These attributes must be accepted by the constructor.

property lc_namespace: List[str]

Return the namespace of the langchain object. eg. [“langchain”, “llms”, “openai”]

property lc_secrets: Dict[str, str]

Return a map of constructor argument names to secret ids. eg. {“openai_api_key”: “OPENAI_API_KEY”}

property lc_serializable: bool

Return whether or not the class is serializable.

model Config[source]

Bases: object

Configuration for this pydantic object.

arbitrary_types_allowed = True

Examples using TFIDFRetriever