langchain.retrievers.arxiv.ArxivRetriever¶

class langchain.retrievers.arxiv.ArxivRetriever(*, arxiv_search: Any = None, arxiv_exceptions: Any = None, top_k_results: int = 3, load_max_docs: int = 100, load_all_available_meta: bool = False, doc_content_chars_max: Optional[int] = 4000, ARXIV_MAX_QUERY_LENGTH: int = 300, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None)[source]¶

Bases: BaseRetriever, ArxivAPIWrapper

Retriever for Arxiv.

It wraps load() to get_relevant_documents(). It uses all ArxivAPIWrapper arguments without any change.

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

param arxiv_exceptions: Any = None¶
param doc_content_chars_max: Optional[int] = 4000¶
param load_all_available_meta: bool = False¶
param load_max_docs: int = 100¶
param metadata: Optional[Dict[str, Any]] = None¶

Optional metadata associated with the retriever. Defaults to None This metadata will be associated with each call to this retriever, and passed as arguments to the handlers defined in callbacks. You can use these to eg identify a specific instance of a retriever with its use case.

param tags: Optional[List[str]] = None¶

Optional list of tags associated with the retriever. Defaults to None These tags will be associated with each call to this retriever, and passed as arguments to the handlers defined in callbacks. You can use these to eg identify a specific instance of a retriever with its use case.

param top_k_results: int = 3¶
async aget_relevant_documents(query: str, *, callbacks: Callbacks = None, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, **kwargs: Any) List[Document]¶

Asynchronously get documents relevant to a query. :param query: string to find relevant documents for :param callbacks: Callback manager or list of callbacks :param tags: Optional list of tags associated with the retriever. Defaults to None

These tags will be associated with each call to this retriever, and passed as arguments to the handlers defined in callbacks.

Parameters

metadata – Optional metadata associated with the retriever. Defaults to None This metadata will be associated with each call to this retriever, and passed as arguments to the handlers defined in callbacks.

Returns

List of relevant documents

async ainvoke(input: str, config: Optional[RunnableConfig] = None) List[Document]¶
get_relevant_documents(query: str, *, callbacks: Callbacks = None, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, **kwargs: Any) List[Document]¶

Retrieve documents relevant to a query. :param query: string to find relevant documents for :param callbacks: Callback manager or list of callbacks :param tags: Optional list of tags associated with the retriever. Defaults to None

These tags will be associated with each call to this retriever, and passed as arguments to the handlers defined in callbacks.

Parameters

metadata – Optional metadata associated with the retriever. Defaults to None This metadata will be associated with each call to this retriever, and passed as arguments to the handlers defined in callbacks.

Returns

List of relevant documents

invoke(input: str, config: Optional[RunnableConfig] = None) List[Document]¶
load(query: str) List[Document]¶

Run Arxiv search and get the article texts plus the article meta information. See https://lukasschwab.me/arxiv.py/index.html#Search

Returns: a list of documents with the document.page_content in text format

Performs an arxiv search, downloads the top k results as PDFs, loads them as Documents, and returns them in a List.

Parameters

query – a plaintext search query

run(query: str) str¶

Performs an arxiv search and A single string with the publish date, title, authors, and summary for each article separated by two newlines.

If an error occurs or no documents found, error text is returned instead. Wrapper for https://lukasschwab.me/arxiv.py/index.html#Search

Parameters

query – a plaintext search query

to_json() Union[SerializedConstructor, SerializedNotImplemented]¶
to_json_not_implemented() SerializedNotImplemented¶
validator validate_environment  »  all fields¶

Validate that the python package exists in environment.

property lc_attributes: Dict¶

Return a list of attribute names that should be included in the serialized kwargs. These attributes must be accepted by the constructor.

property lc_namespace: List[str]¶

Return the namespace of the langchain object. eg. [“langchain”, “llms”, “openai”]

property lc_secrets: Dict[str, str]¶

Return a map of constructor argument names to secret ids. eg. {“openai_api_key”: “OPENAI_API_KEY”}

property lc_serializable: bool¶

Return whether or not the class is serializable.

model Config¶

Bases: object

Configuration for this pydantic object.

arbitrary_types_allowed = True¶

Examples using ArxivRetriever¶