langchain.document_loaders.browserless.BrowserlessLoader
- class langchain.document_loaders.browserless.BrowserlessLoader(api_token: str, urls: Union[str, List[str]], text_content: bool = True)
Bases: BaseLoader
Loads the content of webpages using Browserless’ /content endpoint.
Initialize with API token and the URLs to scrape.
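A minimal usage sketch, based only on the constructor signature above. The helper names `normalize_urls` and `load_pages` are hypothetical and are not part of LangChain. This page does not say how the loader treats a single string passed as `urls`, so the sketch wraps it in a list before constructing the loader. The import sits inside the function so the helper can be defined without `langchain` installed. Calling `load_pages` needs a valid Browserless API token and network access.

```python
from typing import List, Union


def normalize_urls(urls: Union[str, List[str]]) -> List[str]:
    """Hypothetical helper: accept one URL or a list and always return a list."""
    return [urls] if isinstance(urls, str) else list(urls)


def load_pages(api_token: str, urls: Union[str, List[str]], text_content: bool = True):
    """Hypothetical wrapper: build a BrowserlessLoader and return its Documents."""
    # Imported lazily so this module can be imported without langchain present.
    from langchain.document_loaders.browserless import BrowserlessLoader

    loader = BrowserlessLoader(
        api_token=api_token,
        urls=normalize_urls(urls),
        text_content=text_content,
    )
    return loader.load()
```

For example, `load_pages("YOUR_API_TOKEN", "https://example.com")` should return a list of Documents; the token and URL here are placeholders.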
Methods
__init__(api_token, urls[, text_content]) – Initialize with API token and the URLs to scrape.
lazy_load() – Lazy load Documents from URLs.
load() – Load Documents from URLs.
load_and_split([text_splitter]) – Load Documents and split into chunks.
Attributes
api_token – Browserless API token.
urls – List of URLs to scrape.
- load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]
Load Documents and split into chunks. Chunks are returned as Documents.
- Parameters
text_splitter – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.
- Returns
List of Documents.
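To illustrate what "chunks are returned as Documents" means, here is a conceptual sketch that does not use LangChain at all. `Doc` is a hypothetical stand-in for a Document, and `split_documents` is a naive fixed-width splitter, not RecursiveCharacterTextSplitter. Copying a `source` metadata entry onto every chunk is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a Document: text plus metadata."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def split_documents(docs: List[Doc], chunk_size: int = 20) -> List[Doc]:
    """Naive fixed-width splitting; each chunk becomes its own Doc."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks: List[Doc] = []
    for doc in docs:
        text = doc.page_content
        for start in range(0, len(text), chunk_size):
            # Assumption: every chunk carries a copy of its parent's metadata.
            chunks.append(Doc(text[start:start + chunk_size], dict(doc.metadata)))
    return chunks


pages = [Doc("a" * 45, {"source": "https://example.com"})]
chunks = split_documents(pages, chunk_size=20)
print([len(c.page_content) for c in chunks])  # [20, 20, 5]
```

With the real loader, the equivalent call would be `loader.load_and_split()`, optionally passing a TextSplitter instance as `text_splitter`.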
- api_token
Browserless API token.
- urls
List of URLs to scrape.