langchain.embeddings.huggingface.HuggingFaceInstructEmbeddings
- class langchain.embeddings.huggingface.HuggingFaceInstructEmbeddings(*, client: Any = None, model_name: str = 'hkunlp/instructor-large', cache_folder: Optional[str] = None, model_kwargs: Dict[str, Any] = None, encode_kwargs: Dict[str, Any] = None, embed_instruction: str = 'Represent the document for retrieval: ', query_instruction: str = 'Represent the question for retrieving supporting documents: ')
Bases: BaseModel, Embeddings
Wrapper around sentence_transformers embedding models.
To use, you should have the sentence_transformers and InstructorEmbedding Python packages installed.
Example
    from langchain.embeddings import HuggingFaceInstructEmbeddings

    model_name = "hkunlp/instructor-large"
    model_kwargs = {'device': 'cpu'}
    encode_kwargs = {'normalize_embeddings': True}
    hf = HuggingFaceInstructEmbeddings(
        model_name=model_name,
        model_kwargs=model_kwargs,
        encode_kwargs=encode_kwargs
    )
Initialize the sentence_transformer.
- param cache_folder: Optional[str] = None
Path to store models. Can also be set by the SENTENCE_TRANSFORMERS_HOME environment variable.
- param embed_instruction: str = 'Represent the document for retrieval: '
Instruction to use for embedding documents.
- param encode_kwargs: Dict[str, Any] [Optional]
Keyword arguments to pass when calling the encode method of the model.
- param model_kwargs: Dict[str, Any] [Optional]
Keyword arguments to pass to the model.
- param model_name: str = 'hkunlp/instructor-large'
Model name to use.
- param query_instruction: str = 'Represent the question for retrieving supporting documents: '
Instruction to use for embedding queries.
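The two instruction parameters are text prefixes that accompany each input when it is encoded: one for documents, one for queries. The sketch below illustrates the idea in plain Python without loading a model. The helper name `build_instruction_pairs` is hypothetical, and the `[instruction, text]` pair layout is an assumption based on the input format the InstructorEmbedding package accepts, not a quotation of this class's source.

```python
from typing import List

# Default instruction strings, copied from the class signature.
EMBED_INSTRUCTION = "Represent the document for retrieval: "
QUERY_INSTRUCTION = "Represent the question for retrieving supporting documents: "


def build_instruction_pairs(instruction: str, texts: List[str]) -> List[List[str]]:
    """Pair one instruction with every input text (hypothetical helper)."""
    return [[instruction, text] for text in texts]


# Documents and queries get different instructions, so the same text
# can be embedded differently depending on its role.
doc_pairs = build_instruction_pairs(
    EMBED_INSTRUCTION,
    ["LangChain is a framework.", "Embeddings map text to vectors."],
)
query_pairs = build_instruction_pairs(QUERY_INSTRUCTION, ["What is LangChain?"])
```

Overriding `embed_instruction` or `query_instruction` at construction time would change only the first element of each pair in this sketch; the texts themselves stay untouched.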
- embed_documents(texts: List[str]) → List[List[float]]
Compute doc embeddings using a HuggingFace instruct model.
- Parameters
texts – The list of texts to embed.
- Returns
List of embeddings, one for each text.
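Because the return value is a plain `List[List[float]]`, it can be consumed without any extra dependencies. The following is a minimal sketch of ranking documents against a query vector by cosine similarity. The function names and the toy three-dimensional vectors are made up for illustration; real vectors would come from `embed_documents` and have the model's output dimension.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec: List[float], doc_vecs: List[List[float]]) -> List[int]:
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, vec) for vec in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy stand-ins for embed_documents output (made-up data, not model output).
doc_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
query_vector = [0.0, 1.0, 0.0]
ranking = rank_by_similarity(query_vector, doc_vectors)
```

When `encode_kwargs` includes `'normalize_embeddings': True`, as in the example near the top of this page, the vectors should already have unit length, so the dot product alone would give the same ordering.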