`langchain.document_loaders.blackboard`.BlackboardLoader

class langchain.document_loaders.blackboard.BlackboardLoader(blackboard_course_url: str, bbrouter: str, load_all_recursively: bool = True, basic_auth: Optional[Tuple[str, str]] = None, cookies: Optional[dict] = None)

Bases: WebBaseLoader

Loads all documents from a Blackboard course.

This loader is not compatible with all Blackboard courses. It is only compatible with courses that use the new Blackboard interface. To use this loader, you must have the BbRouter cookie. You can get this cookie by logging into the course and then copying the value of the BbRouter cookie from the browser’s developer tools.

Example

from langchain.document_loaders import BlackboardLoader

loader = BlackboardLoader(
    blackboard_course_url="https://blackboard.example.com/webapps/blackboard/execute/announcement?method=search&context=course_entry&course_id=_123456_1",
    bbrouter="expires:12345...",
)
documents = loader.load()

Initialize with the Blackboard course URL.

The BbRouter cookie is required for most Blackboard courses.

Parameters

blackboard_course_url – Blackboard course URL.
bbrouter – BbRouter cookie.
load_all_recursively – If True, load all documents recursively.
basic_auth – Basic auth credentials.
cookies – Cookies.
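
How the `bbrouter` value and the optional `cookies` dictionary might fit together can be sketched as follows. This is an illustration under stated assumptions, not the loader's confirmed internals: `build_cookie_jar` and `to_cookie_header` are hypothetical helpers, and only the cookie name `BbRouter` comes from the description above.

```python
from typing import Dict, Optional


def build_cookie_jar(
    bbrouter: str, cookies: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Hypothetical helper: merge extra cookies with the BbRouter cookie."""
    if not bbrouter:
        raise ValueError("bbrouter must be a non-empty cookie value")
    merged = dict(cookies or {})  # copy so the caller's dict is not mutated
    merged["BbRouter"] = bbrouter  # the explicit argument overrides any duplicate
    return merged


def to_cookie_header(jar: Dict[str, str]) -> str:
    """Hypothetical helper: render the jar as an HTTP Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in jar.items())


jar = build_cookie_jar("expires:12345", {"lang": "en"})
header = to_cookie_header(jar)
```

Copying the incoming dictionary keeps the caller's object unchanged, which avoids surprises when the same `cookies` value is reused for several loaders.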

Raises

ValueError – If the Blackboard course URL is invalid.
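
A minimal sketch of the kind of validation that could lead to this `ValueError`. It assumes that a valid course URL has an `http` or `https` scheme and a path starting with `/webapps/blackboard`, as in the example URL above; `derive_base_url` and `MARKER` are hypothetical names, and the loader's actual check may differ.

```python
from urllib.parse import urlparse

# Assumed path prefix, taken from the example URL in this page.
MARKER = "/webapps/blackboard"


def derive_base_url(blackboard_course_url: str) -> str:
    """Hypothetical helper: return scheme and host, or raise ValueError."""
    parsed = urlparse(blackboard_course_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not an absolute http(s) URL: {blackboard_course_url!r}")
    if not parsed.path.startswith(MARKER):
        raise ValueError(f"URL path does not start with {MARKER!r}")
    return f"{parsed.scheme}://{parsed.netloc}"


base = derive_base_url(
    "https://blackboard.example.com/webapps/blackboard/execute/announcement"
    "?method=search&context=course_entry&course_id=_123456_1"
)
```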

Methods

`__init__`(blackboard_course_url, bbrouter[, ...])	Initialize with the Blackboard course URL.
`aload`()	Asynchronously load text from the URLs in web_path into Documents.
`check_bs4`()	Check if BeautifulSoup4 is installed.
`download`(path)	Download a file from a URL.
`fetch_all`(urls)	Fetch all URLs concurrently with rate limiting.
`lazy_load`()	Lazy load text from the URL(s) in web_path.
`load`()	Load data into Document objects.
`load_and_split`([text_splitter])	Load Documents and split into chunks.
`parse_filename`(url)	Parse the filename from a URL.
`scrape`([parser])	Scrape data from webpage and return it in BeautifulSoup format.
`scrape_all`(urls[, parser])	Fetch all URLs, then return soups for all results.

Attributes

`bs_get_text_kwargs`	Keyword arguments for BeautifulSoup4 get_text.
`default_parser`	Default parser to use for BeautifulSoup.
`raise_for_status`	Raise an exception if http status code denotes an error.
`requests_kwargs`	Keyword arguments for requests.
`requests_per_second`	Max number of concurrent requests to make.
`web_path`
`base_url`	Base URL of the Blackboard course.
`folder_path`	Path to the folder containing the documents.
`load_all_recursively`	If True, load all documents recursively.

aload() → List[Document]: Asynchronously load text from the URLs in web_path into Documents.

check_bs4() → None

Check if BeautifulSoup4 is installed.

Raises: ImportError – If BeautifulSoup4 is not installed.

download(path: str) → None

Download a file from a URL.

Parameters: path – Path to the file.

async fetch_all(urls: List[str]) → Any: Fetch all URLs concurrently with rate limiting.
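
One common way to fetch concurrently while capping the number of in-flight requests is an `asyncio.Semaphore`. The sketch below illustrates that general technique only; `fetch_all_limited`, `fake_fetch`, and `max_concurrent` are hypothetical names, the network call is replaced by a short sleep, and none of it is claimed to mirror the library's implementation.

```python
import asyncio
from typing import Awaitable, Callable, List


async def fetch_all_limited(
    urls: List[str],
    fetch: Callable[[str], Awaitable[str]],
    max_concurrent: int = 2,
) -> List[str]:
    """Run fetch() for every URL, with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch_one(url: str) -> str:
        async with semaphore:  # wait here while the limit is reached
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(fetch_one(url) for url in urls))


async def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTP request (assumption: no network access)."""
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"


pages = asyncio.run(fetch_all_limited(["a", "b", "c"], fake_fetch))
```

Passing the fetch function in as an argument keeps the throttling logic independent of any particular HTTP client.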

lazy_load() → Iterator[Document]: Lazy load text from the URL(s) in web_path.

load() → List[Document]

Load data into Document objects.

Returns: List of Documents.

load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]

Load Documents and split into chunks. Chunks are returned as Documents.

Parameters: text_splitter – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.
Returns: List of Documents.
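
To make the idea of splitting into chunks concrete, here is a deliberately simple fixed-size splitter with overlap. It is a stand-in for illustration, not the RecursiveCharacterTextSplitter algorithm, and `split_text` with its `chunk_size` and `chunk_overlap` parameters is a hypothetical function defined only for this sketch.

```python
from typing import List


def split_text(text: str, chunk_size: int = 100, chunk_overlap: int = 20) -> List[str]:
    """Split text into fixed-size character chunks that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=2)
```

The overlap repeats the tail of each chunk at the head of the next one, which helps keep sentences that straddle a boundary retrievable from at least one chunk.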

parse_filename(url: str) → str

Parse the filename from a URL.

Parameters: url – URL to parse the filename from.
Returns: The filename.
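
As a rough illustration of extracting a filename from a URL, the sketch below takes the last path segment and percent-decodes it. This is an assumed approach using only the standard library; `filename_from_url` and the example path are hypothetical, and the loader's own parsing rules may differ.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def filename_from_url(url: str) -> str:
    """Hypothetical helper: return the decoded last path segment of a URL."""
    path = urlparse(url).path  # drops the query string and fragment
    name = PurePosixPath(unquote(path)).name
    if not name:
        raise ValueError(f"No filename found in URL: {url!r}")
    return name


name = filename_from_url("https://blackboard.example.com/files/Lecture%201.pdf?download=1")
```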

scrape(parser: Optional[str] = None) → Any: Scrape data from webpage and return it in BeautifulSoup format.

scrape_all(urls: List[str], parser: Optional[str] = None) → List[Any]: Fetch all URLs, then return soups for all results.

base_url: str: Base URL of the Blackboard course.

bs_get_text_kwargs: Dict[str, Any] = {}: Keyword arguments for BeautifulSoup4 get_text.

default_parser: str = 'html.parser': Default parser to use for BeautifulSoup.

folder_path: str: Path to the folder containing the documents.

load_all_recursively: bool: If True, load all documents recursively.

raise_for_status: bool = False: Raise an exception if the HTTP status code denotes an error.

requests_kwargs: Dict[str, Any] = {}: Keyword arguments for requests.

requests_per_second: int = 2: Max number of concurrent requests to make.

property web_path: str

web_paths: List[str]

Examples using BlackboardLoader

Blackboard
