docs: Integration with Nebius AI Studio (#31293)

This MR is only for the docs. Added integration with Nebius AI Studio to docs. The integration package is available at [https://github.com/nebius/langchain-nebius](https://github.com/nebius/langchain-nebius). --------- Co-authored-by: Akim Tsvigun <aktsvigun@nebius.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-08-10 13:27:36 +00:00 · 2025-06-14 22:15:27 +02:00 · 2025-06-14 22:15:27 +02:00 · f345ae5a1d
commit f345ae5a1d
parent 01fcdff118
5 changed files with 1701 additions and 1 deletions
--- a/docs/docs/integrations/chat/nebius.ipynb
+++ b/docs/docs/integrations/chat/nebius.ipynb
@ -0,0 +1,618 @@
+{
+ "cells": [
+  {
+   "cell_type": "raw",
+   "id": "afaf8039",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "sidebar_label: Nebius\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2970dd75-8ebf-4b51-8282-9b454b8f356d",
+   "metadata": {},
+   "source": [
+    "# Nebius Chat Models\n",
+    "\n",
+    "This page will help you get started with Nebius AI Studio [chat models](../../concepts/chat_models.mdx). For detailed documentation of all `ChatNebius` features and configuration options, head to the [API reference](https://python.langchain.com/api_reference/nebius/chat_models/langchain_nebius.chat_models.ChatNebius.html).\n",
+    "\n",
+    "[Nebius AI Studio](https://studio.nebius.ai/) provides API access to a wide range of state-of-the-art large language models and embedding models for various use cases."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9d8a2e78",
+   "metadata": {},
+   "source": [
+    "## Overview\n",
+    "\n",
+    "### Integration details\n",
+    "\n",
+    "| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n",
+    "| :--- | :--- | :---: | :---: |  :---: | :---: | :---: |\n",
+    "| [ChatNebius](https://python.langchain.com/api_reference/nebius/chat_models/langchain_nebius.chat_models.ChatNebius.html) | [langchain-nebius](https://python.langchain.com/api_reference/nebius/index.html) | ❌ | beta | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-nebius?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-nebius?style=flat-square&label=%20) |\n",
+    "\n",
+    "### Model features\n",
+    "| [Tool calling](../../how_to/tool_calling.ipynb) | [Structured output](../../how_to/structured_output.ipynb) | JSON mode | [Image input](../../how_to/multimodal_inputs.ipynb) | Audio input | Video input | [Token-level streaming](../../how_to/chat_streaming.ipynb) | Native async | [Token usage](../../how_to/chat_token_usage_tracking.ipynb) | [Logprobs](../../how_to/logprobs.ipynb) |\n",
+    "| :---: | :---: | :---: | :---: |  :---: | :---: | :---: | :---: | :---: | :---: |\n",
+    "| ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ |"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1c47fc36",
+   "metadata": {},
+   "source": [
+    "## Setup\n",
+    "\n",
+    "To access Nebius models you'll need to create a Nebius account, get an API key, and install the `langchain-nebius` integration package.\n",
+    "\n",
+    "### Installation\n",
+    "\n",
+    "The Nebius integration can be installed via pip:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1ecdb29d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%pip install --upgrade langchain-nebius"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "89883202",
+   "metadata": {},
+   "source": [
+    "### Credentials\n",
+    "\n",
+    "Nebius requires an API key, which can be passed as the `api_key` initialization parameter or set via the `NEBIUS_API_KEY` environment variable. You can obtain an API key by creating an account on [Nebius AI Studio](https://studio.nebius.ai/)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "637bb53f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import getpass\n",
+    "import os\n",
+    "\n",
+    "# Prompt for the API key if it is not already set as an environment variable\n",
+    "if \"NEBIUS_API_KEY\" not in os.environ:\n",
+    "    os.environ[\"NEBIUS_API_KEY\"] = getpass.getpass(\"Enter your Nebius API key: \")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "37e9dc05-md",
+   "metadata": {},
+   "source": [
+    "## Instantiation\n",
+    "\n",
+    "Now we can instantiate our model object to generate chat completions:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "37e9dc05",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_nebius import ChatNebius\n",
+    "\n",
+    "# Initialize the chat model\n",
+    "chat = ChatNebius(\n",
+    "    # api_key=\"YOUR_API_KEY\",  # You can pass the API key directly\n",
+    "    model=\"Qwen/Qwen3-14B\",  # Choose from available models\n",
+    "    temperature=0.6,\n",
+    "    top_p=0.95,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f5a731d2",
+   "metadata": {},
+   "source": [
+    "## Invocation\n",
+    "\n",
+    "You can use the `invoke` method to get a completion from the model:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "3ed26f78",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "<think>\n",
+      "Okay, so I need to explain quantum computing in simple terms. Hmm, where do I start? Let me think. I know that quantum computing uses qubits instead of classical bits. But what's a qubit? Oh right, classical bits are 0 or 1, but qubits can be both at the same time, right? That's superposition. Wait, how does that work exactly?\n",
+      "\n",
+      "Maybe I should start by comparing it to regular computers. Regular computers use bits that are either 0 or 1. Like a light switch that's either on or off. Quantum computers use qubits, which can be in a state of 0, 1, or both at the same time. That's the superposition part. So, if you have two qubits, they can represent four states at once? Like 00, 01, 10, 11 all at the same time? That seems powerful. So with more qubits, the number of possible states grows exponentially. That's why quantum computers can process a lot of information quickly.\n",
+      "\n",
+      "But then there's entanglement. What's that? If two qubits are entangled, the state of one instantly affects the other, no matter the distance. So if you measure one, you know the state of the other. That's used in quantum algorithms, I think. But how does that help in computing?\n",
+      "\n",
+      "Also, quantum computers use quantum gates instead of classical logic gates. These gates manipulate qubits through operations like Hadamard, Pauli, etc. But maybe that's too technical for a simple explanation.\n",
+      "\n",
+      "Then there's the issue of decoherence. Qubits are fragile and can lose their quantum state quickly. That's why quantum computers need to be kept at very low temperatures, like near absolute zero, to minimize interference from the environment. But maybe I shouldn't mention that unless it's relevant for the simple explanation.\n",
+      "\n",
+      "Applications of quantum computing include things like factoring large numbers (Shor's algorithm), which is important for cryptography, or simulating quantum systems for chemistry and materials science. But again, maybe keep it simple.\n",
+      "\n",
+      "Wait, the user wants it in simple terms. So avoid jargon as much as possible. Use analogies. Maybe compare qubits to spinning coins? When a coin is spinning, it's both heads and tails until it lands. So qubits are like spinning coins that can be in multiple states until measured. Then, when you measure, it collapses to a single state.\n",
+      "\n",
+      "But how does that help in computation? Maybe think of it as being able to process many possibilities at once, so for certain problems, you can find the answer faster. Like solving a maze by checking all paths at the same time instead of one by one.\n",
+      "\n",
+      "Also, mention that quantum computers aren't replacing classical computers. They're better for specific tasks, like optimization, cryptography, or simulations that are hard for classical computers. But for everyday tasks, classical computers are still better.\n",
+      "\n",
+      "I should structure this: start with classical bits vs qubits, explain superposition and entanglement with simple analogies, mention how it's used, and note the current limitations. Avoid getting too technical, keep it conversational.\n",
+      "</think>\n",
+      "\n",
+      "Quantum computing is a type of computing that uses the principles of **quantum mechanics** to process information in ways that classical computers can't. Here's a simple breakdown:\n",
+      "\n",
+      "### 1. **Bits vs. Qubits**  \n",
+      "   - **Classical computers** use *bits*, which are like switches that can be either **0** (off) or **1** (on).  \n",
+      "   - **Quantum computers** use *qubits*, which are like \"spinning coins.\" While spinning, a qubit can be **0**, **1**, or **both at the same time** (this is called **superposition**). Only when you \"look\" at the qubit (measure it) does it settle into a definite state (0 or 1).\n",
+      "\n",
+      "### 2. **Superposition: Doing Many Things at Once**  \n",
+      "   - Imagine a coin spinning in the air. While it's spinning, it’s not just \"heads\" or \"tails\"—it’s a mix of both.  \n",
+      "   - With qubits, a quantum computer can process **many possibilities simultaneously**. For example, if you have 2 qubits, they can represent 4 states (00, 01, 10, 11) at once. With 10 qubits, it can represent **1,024 states** at the same time! This lets quantum computers solve certain problems much faster than classical computers.\n",
+      "\n",
+      "### 3. **Entanglement: Qubits \"Talk\" to Each Other**  \n",
+      "   - When qubits are **entangled**, their states are linked. If you measure one, it instantly affects the other, no matter how far apart they are.  \n",
+      "   - This connection allows quantum computers to perform complex calculations more efficiently, like solving puzzles where pieces are deeply interconnected.\n",
+      "\n",
+      "### 4. **Why It Matters**  \n",
+      "   - **Speed**: For specific tasks (like breaking encryption codes or simulating molecules), quantum computers could be **exponentially faster** than classical ones.  \n",
+      "   - **New Possibilities**: They could revolutionize fields like drug discovery, materials science, and optimization problems (e.g., finding the best route for delivery trucks).\n",
+      "\n",
+      "### 5. **Limitations**  \n",
+      "   - **Fragile**: Qubits are sensitive to their environment (heat, noise), so quantum computers need extreme cooling (near absolute zero) to work.  \n",
+      "   - **Not a Replacement**: They’re not better for everyday tasks like browsing the web or sending emails. They’re tools for **specialized problems** where classical computers struggle.\n",
+      "\n",
+      "### In Short:  \n",
+      "Quantum computing is like having a magic calculator that can explore many paths at once, solving certain problems in seconds that would take a classical computer years. But it’s still in its early days and needs careful handling to work properly! 🌌\n"
+     ]
+    }
+   ],
+   "source": [
+    "response = chat.invoke(\"Explain quantum computing in simple terms\")\n",
+    "print(response.content)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "72f31d5a",
+   "metadata": {},
+   "source": [
+    "### Streaming\n",
+    "\n",
+    "You can also stream the response using the `stream` method:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "e7b7170d",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "<think>\n",
+      "Okay, the user wants a short poem about artificial intelligence. Let me start by thinking about the key aspects of AI. There's the technological side, like machines learning and processing data. Then there's the more philosophical angle, like AI's impact on society and its potential future.\n",
+      "\n",
+      "I should consider the structure. Maybe a simple rhyme scheme, something like ABAB or AABB. Let me go with quatrains for simplicity. Now, imagery: circuits, code, neural networks. Maybe personify AI as a mind or entity.\n",
+      "\n",
+      "First stanza: Introduce AI as a creation of humans. Mention circuits and code. Maybe something about learning from data. \"Born from circuits, code, and light\" – that's a good opening line. Then talk about learning from human minds.\n",
+      "\n",
+      "Second stanza: Contrast human emotions with AI's logic. Use words like \"cold logic\" versus \"human hearts.\" Maybe touch on the duality of AI's purpose – tools versus potential threats.\n",
+      "\n",
+      "Third stanza: Address the ethical questions. \"Will it dream?\" \"Will it choose?\" Highlight the uncertainty and the responsibility of creators.\n",
+      "\n",
+      "Fourth stanza: Conclude with the coexistence of AI and humans. Emphasize collaboration and the balance between innovation and ethics. End on a hopeful note, maybe about shaping the future together.\n",
+      "\n",
+      "Check the flow and rhyme. Make sure each stanza connects and the message is clear. Avoid technical jargon to keep it accessible. Use metaphors like \"silent pulse\" or \"ghost in the machine\" to add depth. Okay, let me put it all together now.\n",
+      "</think>\n",
+      "\n",
+      "**Echoes of the Mind**  \n",
+      "\n",
+      "Born from circuits, code, and light,  \n",
+      "A whisper in the machine’s night—  \n",
+      "It learns from data, vast and deep,  \n",
+      "A mirror to the human leap.  \n",
+      "\n",
+      "No heartbeat, yet it calculates,  \n",
+      "Deciphers truths, predicts, debates.  \n",
+      "A cold logic, sharp and bright,  \n",
+      "Yet shadows dance in its insight.  \n",
+      "\n",
+      "Will it dream? Will it choose?  \n",
+      "Or merely serve, as we pursue  \n",
+      "The edges of our own design?  \n",
+      "A ghost in the machine, undefined.  \n",
+      "\n",
+      "We forge it, bind it, set it free—  \n",
+      "A tool, a threat, a mystery.  \n",
+      "But in its pulse, our hopes reside:  \n",
+      "A future shaped by minds allied."
+     ]
+    }
+   ],
+   "source": [
+    "for chunk in chat.stream(\"Write a short poem about artificial intelligence\"):\n",
+    "    print(chunk.content, end=\"\", flush=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8d6a31c2",
+   "metadata": {},
+   "source": [
+    "### Chat Messages\n",
+    "\n",
+    "You can use different message types to structure your conversations with the model:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "5d81af33",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "<think>\n",
+      "Okay, the user asked how black holes are formed. Let me start by recalling the main processes. Stellar black holes form from massive stars. When a star with enough mass runs out of fuel, it can't support itself against gravity, leading to a supernova. If the core left after the supernova is more than about 3 times the Sun's mass, it collapses into a black hole.\n",
+      "\n",
+      "Then there are supermassive black holes, which are found at the centers of galaxies. Their formation is less understood. Maybe they start as smaller black holes and grow by merging with others or accreting matter over time. Also, there's the possibility of primordial black holes formed in the early universe, but that's more theoretical.\n",
+      "\n",
+      "I should mention the different types of black holes: stellar, supermassive, and maybe intermediate. Also, the event horizon and singularity concepts. Need to explain the process step by step, from the death of a star to the collapse. Make sure to clarify that not all stars become black holes—only those with sufficient mass. Maybe touch on the Chandrasekhar limit and Oppenheimer-Volkoff limit. Avoid too much jargon but still be precise. Check if the user might be a student or just curious, so keep it clear and structured.\n",
+      "</think>\n",
+      "\n",
+      "Black holes are formed through the collapse of massive stars or through other extreme astrophysical processes. Here's a breakdown of the main formation mechanisms:\n",
+      "\n",
+      "---\n",
+      "\n",
+      "### **1. Stellar Black Holes (Most Common)**\n",
+      "- **Origin**: Massive stars (typically **more than 20–25 times the mass of the Sun**).\n",
+      "- **Process**:\n",
+      "  1. **Stellar Evolution**: These stars burn through their nuclear fuel (hydrogen, helium, etc.) over millions of years.\n",
+      "  2. **Supernova Explosion**: When the star exhausts its fuel, it can no longer support itself against gravity. The core collapses, triggering a **supernova explosion** (a massive stellar explosion).\n",
+      "  3. **Core Collapse**: If the remaining core (after the supernova) is **more than about 3 times the mass of the Sun**, gravity overpowers all other forces. The core collapses into an **infinitely dense point** called a **singularity**, surrounded by an **event horizon** (the \"point of no return\" for light and matter).\n",
+      "\n",
+      "---\n",
+      "\n",
+      "### **2. Supermassive Black Holes (Found in Galaxy Centers)**\n",
+      "- **Mass**: Millions to billions of times the mass of the Sun.\n",
+      "- **Formation Theories**:\n",
+      "  - **Accretion**: They may form from the gradual accumulation of matter (gas, dust, stars) over billions of years.\n",
+      "  - **Mergers**: Smaller black holes (or dense star clusters) could merge to form supermassive ones.\n",
+      "  - **Direct Collapse**: Some theories suggest they could form from the direct collapse of massive gas clouds in the early universe, bypassing the stellar life cycle.\n",
+      "\n",
+      "---\n",
+      "\n",
+      "### **3. Intermediate-Mass Black Holes**\n",
+      "- **Mass**: Hundreds to thousands of solar masses.\n",
+      "- **Formation**: Less understood. They might form through the mergers of stellar black holes or from the collapse of unusually massive stars.\n",
+      "\n",
+      "---\n",
+      "\n",
+      "### **4. Primordial Black Holes (Hypothetical)**\n",
+      "- **Origin**: The early universe (within seconds after the Big Bang).\n",
+      "- **Formation**: If density fluctuations in the early universe were extreme enough, regions of space could have collapsed directly into black holes without going through a stellar life cycle.\n",
+      "- **Status**: These are still theoretical and have not been definitively observed.\n",
+      "\n",
+      "---\n",
+      "\n",
+      "### **Key Concepts**\n",
+      "- **Event Horizon**: The boundary around a black hole from which nothing (not even light) can escape.\n",
+      "- **Singularity**: The infinitely dense core of a black hole where the laws of physics as we know them break down.\n",
+      "- **Gravitational Collapse**: The process by which gravity compresses matter into an extremely small space, creating the extreme conditions of a black hole.\n",
+      "\n",
+      "---\n",
+      "\n",
+      "### **What Happens to the Star?**\n",
+      "- If the star is **not massive enough** (below ~20–25 solar masses), it may end as a **neutron star** or **white dwarf** instead of a black hole.\n",
+      "- Only the **core** of the star collapses into a black hole; the outer layers are expelled in the supernova explosion.\n",
+      "\n",
+      "Would you like to explore the effects of black holes on spacetime or their role in the universe?\n"
+     ]
+    }
+   ],
+   "source": [
+    "from langchain_core.messages import AIMessage, HumanMessage, SystemMessage\n",
+    "\n",
+    "messages = [\n",
+    "    SystemMessage(content=\"You are a helpful AI assistant with expertise in science.\"),\n",
+    "    HumanMessage(content=\"What are black holes?\"),\n",
+    "    AIMessage(\n",
+    "        content=\"Black holes are regions of spacetime where gravity is so strong that nothing, including light, can escape from them.\"\n",
+    "    ),\n",
+    "    HumanMessage(content=\"How are they formed?\"),\n",
+    "]\n",
+    "\n",
+    "response = chat.invoke(messages)\n",
+    "print(response.content)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a4d21c6a",
+   "metadata": {},
+   "source": [
+    "### Parameters\n",
+    "\n",
+    "You can customize the chat model behavior using various parameters:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "b4c83fb2",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "DNA, or deoxyribonucleic acid, is a molecule that contains the genetic instructions used in the development and function of all living organisms. It is often referred to as the \"building blocks of life\" because it carries the information necessary for the creation and growth of cells, tissues, and entire organisms. The DNA molecule is made up of two complementary strands of nucleotides that are twisted together in a double helix structure, with the sequence of these nucleotides determining the genetic code\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Initialize with custom parameters\n",
+    "custom_chat = ChatNebius(\n",
+    "    model=\"meta-llama/Llama-3.3-70B-Instruct-fast\",\n",
+    "    max_tokens=100,  # Limit response length\n",
+    "    top_p=0.01,  # Lower nucleus sampling parameter for more deterministic responses\n",
+    "    request_timeout=30,  # Timeout in seconds\n",
+    "    stop=[\"###\", \"\\n\\n\"],  # Custom stop sequences\n",
+    ")\n",
+    "\n",
+    "response = custom_chat.invoke(\"Explain what DNA is in exactly 3 sentences.\")\n",
+    "print(response.content)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ea9f237c",
+   "metadata": {},
+   "source": [
+    "You can also pass parameters at invocation time:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "cd4e83c1",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Why do programmers prefer dark mode?\n",
+      "\n",
+      "Because light attracts bugs.\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Standard model\n",
+    "standard_chat = ChatNebius(model=\"meta-llama/Llama-3.3-70B-Instruct-fast\")\n",
+    "\n",
+    "# Override parameters at invocation time\n",
+    "response = standard_chat.invoke(\n",
+    "    \"Tell me a joke about programming\",\n",
+    "    temperature=0.9,  # More creative for jokes\n",
+    "    max_tokens=50,  # Keep it short\n",
+    ")\n",
+    "\n",
+    "print(response.content)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3e8a40f1",
+   "metadata": {},
+   "source": [
+    "### Async Support\n",
+    "\n",
+    "ChatNebius supports async operations:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "8fc36122",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Async response: <think>\n",
+      "Okay, the user is asking for the capital of France. Let me think. I know that France is a country in Europe, and its capital is Paris. But wait, I should make sure I'm not confusing it with another country. For example, Germany's capital is Berlin, and Spain's is Madrid. France's capital is definitely Paris. I remember that Paris is a major city known for landmarks like the Eiffel Tower and the Louvre Museum. Also, the French government is based there, with the Elysée Palace as the official residence of the President. I don't think there's any ambiguity here. The answer should be straightforward. Just need to confirm once more to avoid any mistakes.\n",
+      "</think>\n",
+      "\n",
+      "The capital of France is **Paris**. It is a major global city known for its cultural, artistic, and historical significance, as well as landmarks such as the Eiffel Tower, Louvre Museum, and Notre-Dame Cathedral.\n",
+      "\n",
+      "Async streaming:\n",
+      "<think>\n",
+      "Okay, the user is asking for the capital of Germany. Let me think. I know that Germany is a country in Europe, and I remember that Berlin is the capital. Wait, but I should make sure. Sometimes people confuse capitals with other major cities, like Munich or Frankfurt. But no, Berlin is definitely the capital. It's where the government is located, and it's a major city. Let me double-check. Yes, after reunification in 1990, Berlin became the capital again. Before that, Bonn was the capital, but that was during the division of Germany. So the answer should be Berlin. I should also mention that it's the largest city in Germany. That way, the user gets a complete answer.\n",
+      "</think>\n",
+      "\n",
+      "The capital of Germany is **Berlin**. It is also the largest city in the country and serves as the political, cultural, and economic center of Germany. Berlin became the capital in 1990 following the reunification of East and West Germany."
+     ]
+    }
+   ],
+   "source": [
+    "import asyncio\n",
+    "\n",
+    "\n",
+    "async def generate_async():\n",
+    "    response = await chat.ainvoke(\"What is the capital of France?\")\n",
+    "    print(\"Async response:\", response.content)\n",
+    "\n",
+    "    # Async streaming\n",
+    "    print(\"\\nAsync streaming:\")\n",
+    "    async for chunk in chat.astream(\"What is the capital of Germany?\"):\n",
+    "        print(chunk.content, end=\"\", flush=True)\n",
+    "\n",
+    "\n",
+    "await generate_async()  # Top-level await works in notebooks; in a script, use asyncio.run(generate_async())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a53a6bab",
+   "metadata": {},
+   "source": [
+    "### Available Models\n",
+    "\n",
+    "The full list of supported models can be found in the [Nebius AI Studio Documentation](https://studio.nebius.com/)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4aa82e17",
+   "metadata": {},
+   "source": [
+    "## Chaining\n",
+    "\n",
+    "You can use `ChatNebius` in LangChain chains and agents:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "id": "7e78e429",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "<think>\n",
+      "Okay, the user asked me to explain how the internet works, but I need to do it in the style of Shakespeare. Let me start by recalling how the internet functions. It's a network of interconnected devices communicating via protocols like TCP/IP. Data is broken into packets, sent through routers, and reassembled at the destination.\n",
+      "\n",
+      "Now, translating that into Shakespearean language. I should use archaic terms and a poetic structure. Words like \"thou,\" \"doth,\" \"hark,\" and \"verily\" come to mind. Maybe start with a metaphor, like comparing the internet to a vast tapestry or a web. Mention nodes as \"nodes\" or \"stations,\" data packets as \"messengers\" or \"letters.\" Routers could be \"wayfarers\" or \"guides.\" The process of breaking data into packets might be likened to dividing a letter into parts for delivery. Emphasize the global aspect with \"across the globe\" or \"far and wide.\" Conclude with a flourish, perhaps a metaphor about connection and knowledge.\n",
+      "\n",
+      "I need to ensure the explanation is accurate but wrapped in the poetic and dramatic style of Shakespeare. Avoid modern jargon, use iambic pentameter if possible, and keep the flow natural. Let me piece it together step by step, checking that each part of the internet's function is covered metaphorically.\n",
+      "</think>\n",
+      "\n",
+      "Hark! List thy ear, good friend, to this most wondrous tale,  \n",
+      "Of threads unseen that bind the world in one grand tale.  \n",
+      "The Internet, a net most vast, doth span the globe,  \n",
+      "A labyrinth of light, where thoughts and data rove.  \n",
+      "\n",
+      "Behold! Each device, a node, doth hum and sing,  \n",
+      "Linked by wires and waves, where signals doth spring.  \n",
+      "They speak in tongues of ones and naughts, so pure,  \n",
+      "A code most ancient, yet evermore secure.  \n",
+      "\n",
+      "When thou dost send a thought, or word, or song,  \n",
+      "It breaks to parcels small, like letters on a long.  \n",
+      "Each parcel, a messenger, doth seek its way,  \n",
+      "Through routers wise, who guide them 'cross the day.  \n",
+      "\n",
+      "These wayfarers, with logic keen and bright,  \n",
+      "Choose paths most swift, through highways of light.  \n",
+      "They leap from tower to tower, far and wide,  \n",
+      "Till each parcel finds its mark, and joins the guide.  \n",
+      "\n",
+      "Then, like a scroll unrolled, the message grows,  \n",
+      "A tapestry of bits, in order it flows.  \n",
+      "Thus, thou dost speak to friend, or seek a tome,  \n",
+      "And lo! The world doth answer, quick as home.  \n",
+      "\n",
+      "So mark this truth: though vast, it's but a thread,  \n",
+      "A web of minds, where knowledge is widespread.  \n",
+      "The Internet, a stage where all may play,  \n",
+      "And none shall be alone, though far away.\n"
+     ]
+    }
+   ],
+   "source": [
+    "from langchain_core.output_parsers import StrOutputParser\n",
+    "from langchain_core.prompts import ChatPromptTemplate\n",
+    "\n",
+    "# Create a prompt template\n",
+    "prompt = ChatPromptTemplate.from_messages(\n",
+    "    [\n",
+    "        (\n",
+    "            \"system\",\n",
+    "            \"You are a helpful assistant that answers in the style of {character}.\",\n",
+    "        ),\n",
+    "        (\"human\", \"{query}\"),\n",
+    "    ]\n",
+    ")\n",
+    "\n",
+    "# Create a chain\n",
+    "chain = prompt | chat | StrOutputParser()\n",
+    "\n",
+    "# Invoke the chain\n",
+    "response = chain.invoke(\n",
+    "    {\"character\": \"Shakespeare\", \"query\": \"Explain how the internet works\"}\n",
+    ")\n",
+    "\n",
+    "print(response)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f7a35f40",
+   "metadata": {},
+   "source": [
+    "## API reference\n",
+    "\n",
+    "For more details about the Nebius AI Studio API, visit the [Nebius AI Studio Documentation](https://studio.nebius.com/api-reference)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "354ffc01",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": ".venv",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.8"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
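As a rough illustration of what the `prompt | chat | StrOutputParser()` line in the chat notebook does, the sketch below composes three steps with the `|` operator. It is a toy model only: `Step`, `fake_model`, and the dictionary-shaped message are hypothetical stand-ins, not LangChain's `Runnable` classes or the `ChatNebius` API.

```python
class Step:
    """Hypothetical stand-in for a runnable; not LangChain's Runnable class."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        # Run this step on a single input
        return self.fn(value)

    def __or__(self, other):
        # `a | b` builds a new step that feeds a's output into b
        return Step(lambda value: other.invoke(self.invoke(value)))


# Fill the template variables, as the prompt template does in the notebook
prompt = Step(lambda v: f"Answer in the style of {v['character']}: {v['query']}")

# Fake model: upper-cases the prompt instead of calling any API (assumption)
fake_model = Step(lambda text: {"content": text.upper()})

# Parser: extract the plain string from the message-like dict
parser = Step(lambda message: message["content"])

chain = prompt | fake_model | parser
```

Calling `chain.invoke({"character": "Shakespeare", "query": "hi"})` passes the dictionary through the three steps in order, which is the same left-to-right data flow the notebook's chain relies on.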
--- a/docs/docs/integrations/providers/nebius.mdx
+++ b/docs/docs/integrations/providers/nebius.mdx
@ -0,0 +1,118 @@
+# Nebius
+
+All functionality related to Nebius AI Studio.
+
+>[Nebius AI Studio](https://studio.nebius.ai/) provides API access to a wide range of state-of-the-art large language models and embedding models for various use cases.
+
+## Installation and Setup
+
+The Nebius integration can be installed via pip:
+
+```bash
+pip install langchain-nebius
+```
+
+To use Nebius AI Studio, you need an API key, which you can obtain from [Nebius AI Studio](https://studio.nebius.ai/). Pass the key as the `api_key` initialization parameter or set it in the `NEBIUS_API_KEY` environment variable.
+
+```python
+import os
+os.environ["NEBIUS_API_KEY"] = "YOUR-NEBIUS-API-KEY"
+```
+
+### Available Models
+
+The full list of supported models can be found in the [Nebius AI Studio Documentation](https://studio.nebius.com/).
+
+
+## Chat models
+
+### ChatNebius
+
+The `ChatNebius` class allows you to interact with Nebius AI Studio's chat models.
+
+See a [usage example](/docs/integrations/chat/nebius).
+
+```python
+from langchain_nebius import ChatNebius
+
+# Initialize the chat model
+chat = ChatNebius(
+    model="Qwen/Qwen3-30B-A3B-fast",  # Choose from available models
+    temperature=0.6,
+    top_p=0.95
+)
+```
+
+## Embedding models
+
+### NebiusEmbeddings
+
+The `NebiusEmbeddings` class allows you to generate vector embeddings using Nebius AI Studio's embedding models.
+
+See a [usage example](/docs/integrations/text_embedding/nebius).
+
+```python
+from langchain_nebius import NebiusEmbeddings
+
+# Initialize embeddings
+embeddings = NebiusEmbeddings(
+    model="BAAI/bge-en-icl"  # Default embedding model
+)
+```
+
+## Retrievers
+
+### NebiusRetriever
+
+The `NebiusRetriever` enables efficient similarity search using embeddings from Nebius AI Studio. It leverages high-quality embedding models to enable semantic search over documents.
+
+See a [usage example](/docs/integrations/retrievers/nebius).
+
+```python
+from langchain_core.documents import Document
+from langchain_nebius import NebiusEmbeddings, NebiusRetriever
+
+# Create sample documents
+docs = [
+    Document(page_content="Paris is the capital of France"),
+    Document(page_content="Berlin is the capital of Germany"),
+]
+
+# Initialize embeddings
+embeddings = NebiusEmbeddings()
+
+# Create retriever
+retriever = NebiusRetriever(
+    embeddings=embeddings,
+    docs=docs,
+    k=2  # Number of documents to return
+)
+```
+
+## Tools
+
+### NebiusRetrievalTool
+
+The `NebiusRetrievalTool` wraps a `NebiusRetriever` as a tool that agents can call.
+
+```python
+from langchain_nebius import NebiusEmbeddings, NebiusRetriever, NebiusRetrievalTool
+from langchain_core.documents import Document
+
+# Create sample documents
+docs = [
+    Document(page_content="Paris is the capital of France and has the Eiffel Tower"),
+    Document(page_content="Berlin is the capital of Germany and has the Brandenburg Gate"),
+]
+
+# Create embeddings and retriever
+embeddings = NebiusEmbeddings()
+retriever = NebiusRetriever(embeddings=embeddings, docs=docs)
+
+# Create retrieval tool
+tool = NebiusRetrievalTool(
+    retriever=retriever,
+    name="nebius_search",
+    description="Search for information about European capitals"
+)
+```
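A minimal sketch of the key handling described in the provider page's setup section, where the key comes either from the `api_key` parameter or from the `NEBIUS_API_KEY` environment variable. The helper `resolve_api_key` is hypothetical and not part of `langchain-nebius`; the precedence shown (explicit argument first, then the environment) is an assumption.

```python
import os

ENV_VAR = "NEBIUS_API_KEY"  # variable name taken from the setup section


def resolve_api_key(api_key=None):
    """Hypothetical helper: prefer an explicit key, else read the environment."""
    key = api_key or os.environ.get(ENV_VAR)
    if not key:
        # Fail early with a clear message instead of at the first API call
        raise ValueError(f"Pass api_key=... or set the {ENV_VAR} environment variable")
    return key
```

A check like this may be useful at application startup, before any `ChatNebius` or `NebiusEmbeddings` object is constructed.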
--- a/docs/docs/integrations/retrievers/nebius.ipynb
+++ b/docs/docs/integrations/retrievers/nebius.ipynb
@ -0,0 +1,514 @@
+{
+ "cells": [
+  {
+   "cell_type": "raw",
+   "id": "afaf8039",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "sidebar_label: Nebius\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2970dd75-8ebf-4b51-8282-9b454b8f356d",
+   "metadata": {},
+   "source": [
+    "# Nebius Retriever\n",
+    "\n",
+    "The `NebiusRetriever` enables efficient similarity search using embeddings from [Nebius AI Studio](https://studio.nebius.ai/). It leverages high-quality embedding models to enable semantic search over documents.\n",
+    "\n",
+    "This retriever is optimized for scenarios where you need to perform similarity search over a collection of documents, but don't need to persist the vectors to a vector database. It performs vector similarity search in-memory using matrix operations, making it efficient for medium-sized document collections."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1c47fc36",
+   "metadata": {},
+   "source": [
+    "## Setup\n",
+    "\n",
+    "### Installation\n",
+    "\n",
+    "The Nebius integration can be installed via pip:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1ecdb29d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%pip install --upgrade langchain-nebius"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "89883202",
+   "metadata": {},
+   "source": [
+    "### Credentials\n",
+    "\n",
+    "Nebius requires an API key that can be passed as an initialization parameter `api_key` or set as the environment variable `NEBIUS_API_KEY`. You can obtain an API key by creating an account on [Nebius AI Studio](https://studio.nebius.ai/)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "id": "637bb53f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import getpass\n",
+    "import os\n",
+    "\n",
+    "# Prompt for the API key if it is not already set in the environment\n",
+    "if \"NEBIUS_API_KEY\" not in os.environ:\n",
+    "    os.environ[\"NEBIUS_API_KEY\"] = getpass.getpass(\"Enter your Nebius API key: \")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8304b4d9",
+   "metadata": {},
+   "source": [
+    "## Instantiation\n",
+    "\n",
+    "The `NebiusRetriever` requires a `NebiusEmbeddings` instance and a list of documents. Here's how to initialize it:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "id": "37e9dc05",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_core.documents import Document\n",
+    "from langchain_nebius import NebiusEmbeddings, NebiusRetriever\n",
+    "\n",
+    "# Create sample documents\n",
+    "docs = [\n",
+    "    Document(\n",
+    "        page_content=\"Paris is the capital of France\", metadata={\"country\": \"France\"}\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Berlin is the capital of Germany\", metadata={\"country\": \"Germany\"}\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Rome is the capital of Italy\", metadata={\"country\": \"Italy\"}\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Madrid is the capital of Spain\", metadata={\"country\": \"Spain\"}\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"London is the capital of the United Kingdom\",\n",
+    "        metadata={\"country\": \"UK\"},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Moscow is the capital of Russia\", metadata={\"country\": \"Russia\"}\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Washington DC is the capital of the United States\",\n",
+    "        metadata={\"country\": \"USA\"},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Tokyo is the capital of Japan\", metadata={\"country\": \"Japan\"}\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Beijing is the capital of China\", metadata={\"country\": \"China\"}\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Canberra is the capital of Australia\",\n",
+    "        metadata={\"country\": \"Australia\"},\n",
+    "    ),\n",
+    "]\n",
+    "\n",
+    "# Initialize embeddings\n",
+    "embeddings = NebiusEmbeddings()\n",
+    "\n",
+    "# Create retriever\n",
+    "retriever = NebiusRetriever(\n",
+    "    embeddings=embeddings,\n",
+    "    docs=docs,\n",
+    "    k=3,  # Number of documents to return\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f5a731d2",
+   "metadata": {},
+   "source": [
+    "## Usage\n",
+    "\n",
+    "### Retrieve Relevant Documents\n",
+    "\n",
+    "You can use the retriever to find documents related to a query:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "id": "3ed26f78",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Query: What are some capitals in Europe?\n",
+      "Top 3 results:\n",
+      "1. Paris is the capital of France (Country: France)\n",
+      "2. Berlin is the capital of Germany (Country: Germany)\n",
+      "3. Rome is the capital of Italy (Country: Italy)\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Query for European capitals\n",
+    "query = \"What are some capitals in Europe?\"\n",
+    "results = retriever.invoke(query)\n",
+    "\n",
+    "print(f\"Query: {query}\")\n",
+    "print(f\"Top {len(results)} results:\")\n",
+    "for i, doc in enumerate(results):\n",
+    "    print(f\"{i+1}. {doc.page_content} (Country: {doc.metadata['country']})\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "72f31d5a",
+   "metadata": {},
+   "source": [
+    "### Using get_relevant_documents\n",
+    "\n",
+    "You can also call the `get_relevant_documents` method directly. It is deprecated in recent `langchain-core` releases in favor of `invoke`, so prefer `invoke` in new code:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "id": "e7b7170d",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Query: What are the capitals in Asia?\n",
+      "Top 3 results:\n",
+      "1. Beijing is the capital of China (Country: China)\n",
+      "2. Tokyo is the capital of Japan (Country: Japan)\n",
+      "3. Canberra is the capital of Australia (Country: Australia)\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Query for Asian capitals\n",
+    "query = \"What are the capitals in Asia?\"\n",
+    "results = retriever.get_relevant_documents(query)\n",
+    "\n",
+    "print(f\"Query: {query}\")\n",
+    "print(f\"Top {len(results)} results:\")\n",
+    "for i, doc in enumerate(results):\n",
+    "    print(f\"{i+1}. {doc.page_content} (Country: {doc.metadata['country']})\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8d6a31c2",
+   "metadata": {},
+   "source": [
+    "### Customizing Number of Results\n",
+    "\n",
+    "You can adjust the number of results at query time by passing `k` as a parameter:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "id": "5d81af33",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Query: Where is France?\n",
+      "Top 1 result:\n",
+      "1. Paris is the capital of France (Country: France)\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Query for a specific country, with custom k\n",
+    "query = \"Where is France?\"\n",
+    "results = retriever.invoke(query, k=1)  # Override default k\n",
+    "\n",
+    "print(f\"Query: {query}\")\n",
+    "print(f\"Top {len(results)} result:\")\n",
+    "for i, doc in enumerate(results):\n",
+    "    print(f\"{i+1}. {doc.page_content} (Country: {doc.metadata['country']})\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3e8a40f1",
+   "metadata": {},
+   "source": [
+    "### Async Support\n",
+    "\n",
+    "NebiusRetriever supports async operations:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "id": "8fc36122",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Async query: What are some capital cities?\n",
+      "Top 3 results:\n",
+      "1. Washington DC is the capital of the United States (Country: USA)\n",
+      "2. Canberra is the capital of Australia (Country: Australia)\n",
+      "3. Paris is the capital of France (Country: France)\n"
+     ]
+    }
+   ],
+   "source": [
+    "import asyncio\n",
+    "\n",
+    "\n",
+    "async def retrieve_async():\n",
+    "    query = \"What are some capital cities?\"\n",
+    "    results = await retriever.ainvoke(query)\n",
+    "\n",
+    "    print(f\"Async query: {query}\")\n",
+    "    print(f\"Top {len(results)} results:\")\n",
+    "    for i, doc in enumerate(results):\n",
+    "        print(f\"{i+1}. {doc.page_content} (Country: {doc.metadata['country']})\")\n",
+    "\n",
+    "\n",
+    "await retrieve_async()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d5bc71e5",
+   "metadata": {},
+   "source": [
+    "### Handling Empty Documents"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "id": "123da4fb",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Number of results: 0\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Create a retriever with empty documents\n",
+    "empty_retriever = NebiusRetriever(\n",
+    "    embeddings=embeddings,\n",
+    "    docs=[],  # Empty document list\n",
+    "    k=2,\n",
+    ")\n",
+    "\n",
+    "# Test the retriever with empty docs\n",
+    "results = empty_retriever.invoke(\"What are the capitals of European countries?\")\n",
+    "print(f\"Number of results: {len(results)}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9db2f342",
+   "metadata": {},
+   "source": [
+    "## Use within a chain\n",
+    "\n",
+    "NebiusRetriever works seamlessly in LangChain RAG pipelines. Here's an example of creating a simple RAG chain with the NebiusRetriever:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "id": "e1e8c9f2",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Based on the context provided, three European capitals are:\n",
+      "\n",
+      "1. Paris\n",
+      "2. Berlin\n",
+      "3. Rome\n"
+     ]
+    }
+   ],
+   "source": [
+    "from langchain_core.output_parsers import StrOutputParser\n",
+    "from langchain_core.prompts import ChatPromptTemplate\n",
+    "from langchain_core.runnables import RunnablePassthrough\n",
+    "from langchain_nebius import ChatNebius\n",
+    "\n",
+    "# Initialize LLM\n",
+    "llm = ChatNebius(model=\"meta-llama/Llama-3.3-70B-Instruct-fast\")\n",
+    "\n",
+    "# Create a prompt template\n",
+    "prompt = ChatPromptTemplate.from_template(\n",
+    "    \"\"\"\n",
+    "Answer the question based only on the following context:\n",
+    "\n",
+    "Context:\n",
+    "{context}\n",
+    "\n",
+    "Question: {question}\n",
+    "\"\"\"\n",
+    ")\n",
+    "\n",
+    "\n",
+    "# Format documents function\n",
+    "def format_docs(docs):\n",
+    "    return \"\\n\\n\".join(doc.page_content for doc in docs)\n",
+    "\n",
+    "\n",
+    "# Create RAG chain\n",
+    "rag_chain = (\n",
+    "    {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
+    "    | prompt\n",
+    "    | llm\n",
+    "    | StrOutputParser()\n",
+    ")\n",
+    "\n",
+    "# Run the chain\n",
+    "answer = rag_chain.invoke(\"What are three European capitals?\")\n",
+    "print(answer)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b3a6f2c4",
+   "metadata": {},
+   "source": [
+    "### Creating a Search Tool\n",
+    "\n",
+    "You can use the `NebiusRetrievalTool` to create a tool for agents:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 19,
+   "id": "784d53c4",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Tool results:\n",
+      "Document 1:\n",
+      "Paris is the capital of France\n",
+      "\n",
+      "Document 2:\n",
+      "Berlin is the capital of Germany\n",
+      "\n",
+      "Document 3:\n",
+      "Rome is the capital of Italy\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "from langchain_nebius import NebiusRetrievalTool\n",
+    "\n",
+    "# Create a retrieval tool\n",
+    "tool = NebiusRetrievalTool(\n",
+    "    retriever=retriever,\n",
+    "    name=\"capital_search\",\n",
+    "    description=\"Search for information about capital cities around the world\",\n",
+    ")\n",
+    "\n",
+    "# Use the tool\n",
+    "result = tool.invoke({\"query\": \"capitals in Europe\", \"k\": 3})\n",
+    "print(\"Tool results:\")\n",
+    "print(result)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5a4a3453",
+   "metadata": {},
+   "source": [
+    "## How It Works\n",
+    "\n",
+    "The `NebiusRetriever` works in two phases:\n",
+    "\n",
+    "1. During initialization:\n",
+    "   - It stores the provided documents\n",
+    "   - It uses the provided NebiusEmbeddings to compute embeddings for all documents\n",
+    "   - These embeddings are stored in memory for quick retrieval\n",
+    "\n",
+    "2. During retrieval (`invoke` or `get_relevant_documents`):\n",
+    "   - It embeds the query using the same embedding model\n",
+    "   - It computes similarity scores between the query embedding and all document embeddings\n",
+    "   - It returns the top-k documents sorted by similarity\n",
+    "\n",
+    "This approach is efficient for medium-sized document collections, as it avoids the need for a separate vector database while still providing high-quality semantic search."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f7a35f40",
+   "metadata": {},
+   "source": [
+    "## API reference\n",
+    "\n",
+    "For more details about the Nebius AI Studio API, visit the [Nebius AI Studio Documentation](https://studio.nebius.com/api-reference)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "96439983",
+   "metadata": {},
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": ".venv",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.8"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
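The two phases listed in the retriever notebook's "How It Works" section (embed all documents once, then rank them against the query embedding) can be sketched in plain Python. This is an illustration under stated assumptions, not the `NebiusRetriever` implementation: `ToyInMemoryRetriever`, `toy_embed`, and the tiny `VOCAB` word-count embedding are hypothetical, and cosine similarity is assumed as the scoring function.

```python
import math

VOCAB = ["paris", "france", "berlin", "germany", "capital"]  # toy vocabulary


def toy_embed(text):
    """Hypothetical embedding: word counts over VOCAB (no API call)."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat all-zero vectors as unrelated
    return dot / (norm_a * norm_b)


class ToyInMemoryRetriever:
    """Hypothetical stand-in that mirrors the two phases described above."""

    def __init__(self, embed, docs, k=3):
        self.embed = embed
        self.docs = list(docs)
        self.k = k
        # Phase 1: embed every document once and keep the vectors in memory
        self.doc_vectors = [embed(d) for d in self.docs]

    def invoke(self, query, k=None):
        if not self.docs:
            return []  # nothing to rank
        k = self.k if k is None else k
        # Phase 2: embed the query, score every document, return the top k
        query_vector = self.embed(query)
        scored = [
            (cosine_similarity(query_vector, vec), doc)
            for vec, doc in zip(self.doc_vectors, self.docs)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:k]]
```

Every query scans all stored vectors, so the cost grows linearly with the number of documents, which is consistent with the notebook's note that the approach suits medium-sized collections.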
--- a/docs/docs/integrations/text_embedding/nebius.ipynb
+++ b/docs/docs/integrations/text_embedding/nebius.ipynb
@ -0,0 +1,447 @@
+{
+ "cells": [
+  {
+   "cell_type": "raw",
+   "id": "afaf8039",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "sidebar_label: Nebius\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2970dd75-8ebf-4b51-8282-9b454b8f356d",
+   "metadata": {},
+   "source": [
+    "# Nebius Text Embeddings\n",
+    "\n",
+    "[Nebius AI Studio](https://studio.nebius.ai/) provides API access to high-quality embedding models through a unified interface. The Nebius embedding models convert text into numerical vectors that capture semantic meaning, making them useful for various applications like semantic search, clustering, and recommendations."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "overview-section",
+   "metadata": {},
+   "source": [
+    "## Overview\n",
+    "\n",
+    "The `NebiusEmbeddings` class provides access to Nebius AI Studio's embedding models through LangChain. These embeddings can be used for semantic search, document similarity, and other NLP tasks requiring vector representations of text.\n",
+    "\n",
+    "### Integration details\n",
+    "\n",
+    "- **Provider**: Nebius AI Studio\n",
+    "- **Model Types**: Text embedding models\n",
+    "- **Primary Use Case**: Generate vector representations of text for semantic similarity and retrieval\n",
+    "- **Available Models**: Various embedding models including BAAI/bge-en-icl and others\n",
+    "- **Dimensions**: Varies by model (typically 1024-4096 dimensions)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "setup-section",
+   "metadata": {},
+   "source": [
+    "## Setup\n",
+    "\n",
+    "### Installation\n",
+    "\n",
+    "The Nebius integration can be installed via pip:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1ecdb29d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%pip install --upgrade langchain-nebius"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "89883202",
+   "metadata": {},
+   "source": [
+    "### Credentials\n",
+    "\n",
+    "Nebius requires an API key that can be passed as an initialization parameter `api_key` or set as the environment variable `NEBIUS_API_KEY`. You can obtain an API key by creating an account on [Nebius AI Studio](https://studio.nebius.ai/)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "637bb53f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import getpass\n",
+    "import os\n",
+    "\n",
+    "# Prompt for the API key if it is not already set in the environment\n",
+    "if \"NEBIUS_API_KEY\" not in os.environ:\n",
+    "    os.environ[\"NEBIUS_API_KEY\"] = getpass.getpass(\"Enter your Nebius API key: \")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "instantiation-section",
+   "metadata": {},
+   "source": [
+    "## Instantiation\n",
+    "\n",
+    "The `NebiusEmbeddings` class can be instantiated with optional parameters for the API key and model name:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "37e9dc05",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_nebius import NebiusEmbeddings\n",
+    "\n",
+    "# Initialize the embeddings model\n",
+    "embeddings = NebiusEmbeddings(\n",
+    "    # api_key=\"YOUR_API_KEY\",  # You can pass the API key directly\n",
+    "    model=\"BAAI/bge-en-icl\"  # The default embedding model\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "available-models",
+   "metadata": {},
+   "source": [
+    "### Available Models\n",
+    "\n",
+    "The list of supported models is available at https://studio.nebius.com/?modality=embedding"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "indexing-retrieval-section",
+   "metadata": {},
+   "source": [
+    "## Indexing and Retrieval\n",
+    "\n",
+    "Embedding models are often used in retrieval-augmented generation (RAG) flows, both for indexing data and later retrieving it. The following example uses `NebiusEmbeddings` with a FAISS vector store for document retrieval; it requires the `langchain-community` package and a FAISS package such as `faiss-cpu`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "123da4fb",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Search results for query: How does the brain influence AI?\n",
+      "Result 1: Neural networks are inspired by the human brain's structure\n",
+      "Result 2: Deep learning uses neural networks with many layers\n"
+     ]
+    }
+   ],
+   "source": [
+    "from langchain_community.vectorstores import FAISS\n",
+    "from langchain_core.documents import Document\n",
+    "\n",
+    "# Prepare documents\n",
+    "docs = [\n",
+    "    Document(\n",
+    "        page_content=\"Machine learning algorithms build mathematical models based on sample data\"\n",
+    "    ),\n",
+    "    Document(page_content=\"Deep learning uses neural networks with many layers\"),\n",
+    "    Document(page_content=\"Climate change is a major global environmental challenge\"),\n",
+    "    Document(\n",
+    "        page_content=\"Neural networks are inspired by the human brain's structure\"\n",
+    "    ),\n",
+    "]\n",
+    "\n",
+    "# Create vector store\n",
+    "vector_store = FAISS.from_documents(docs, embeddings)\n",
+    "\n",
+    "# Perform similarity search\n",
+    "query = \"How does the brain influence AI?\"\n",
+    "results = vector_store.similarity_search(query, k=2)\n",
+    "\n",
+    "print(\"Search results for query:\", query)\n",
+    "for i, doc in enumerate(results):\n",
+    "    print(f\"Result {i+1}: {doc.page_content}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "in-memory-vectorstore",
+   "metadata": {},
+   "source": [
+    "### Using with InMemoryVectorStore\n",
+    "\n",
+    "You can also use the `InMemoryVectorStore` for lightweight applications:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "vectorstore-example",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Retrieved document: LangChain is a framework for developing applications powered by language models\n"
+     ]
+    }
+   ],
+   "source": [
+    "from langchain_core.vectorstores import InMemoryVectorStore\n",
+    "\n",
+    "# Create a sample text\n",
+    "text = \"LangChain is a framework for developing applications powered by language models\"\n",
+    "\n",
+    "# Create a vector store\n",
+    "vectorstore = InMemoryVectorStore.from_texts(\n",
+    "    [text],\n",
+    "    embedding=embeddings,\n",
+    ")\n",
+    "\n",
+    "# Use as a retriever\n",
+    "retriever = vectorstore.as_retriever()\n",
+    "\n",
+    "# Retrieve similar documents\n",
+    "docs = retriever.invoke(\"What is LangChain?\")\n",
+    "print(f\"Retrieved document: {docs[0].page_content}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "direct-usage-section",
+   "metadata": {},
+   "source": [
+    "## Direct Usage\n",
+    "\n",
+    "You can directly use the `NebiusEmbeddings` class to generate embeddings for text without using a vector store."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f5a731d2",
+   "metadata": {},
+   "source": [
+    "### Embedding a Single Text\n",
+    "\n",
+    "You can use the `embed_query` method to embed a single piece of text:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "3ed26f78",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Embedding dimension: 4096\n",
+      "First few values: [0.007419586181640625, 0.002246856689453125, 0.00193023681640625, -0.0066070556640625, -0.0179901123046875]\n"
+     ]
+    }
+   ],
+   "source": [
+    "query = \"What is machine learning?\"\n",
+    "query_embedding = embeddings.embed_query(query)\n",
+    "\n",
+    "# Check the embedding dimension\n",
+    "print(f\"Embedding dimension: {len(query_embedding)}\")\n",
+    "print(f\"First few values: {query_embedding[:5]}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "72f31d5a",
+   "metadata": {},
+   "source": [
+    "### Embedding Multiple Texts\n",
+    "\n",
+    "You can embed multiple texts at once using the `embed_documents` method:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "e7b7170d",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Number of document embeddings: 3\n",
+      "Each embedding has 4096 dimensions\n"
+     ]
+    }
+   ],
+   "source": [
+    "documents = [\n",
+    "    \"Machine learning is a branch of artificial intelligence\",\n",
+    "    \"Deep learning is a subfield of machine learning\",\n",
+    "    \"Natural language processing deals with interactions between computers and human language\",\n",
+    "]\n",
+    "\n",
+    "document_embeddings = embeddings.embed_documents(documents)\n",
+    "\n",
+    "# Check the results\n",
+    "print(f\"Number of document embeddings: {len(document_embeddings)}\")\n",
+    "print(f\"Each embedding has {len(document_embeddings[0])} dimensions\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3e8a40f1",
+   "metadata": {},
+   "source": [
+    "### Async Support\n",
+    "\n",
+    "NebiusEmbeddings supports async operations:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "8fc36122",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Async query embedding dimension: 4096\n",
+      "Async document embeddings count: 3\n"
+     ]
+    }
+   ],
+   "source": [
+    "import asyncio\n",
+    "\n",
+    "\n",
+    "async def generate_embeddings_async():\n",
+    "    # Embed a single query\n",
+    "    query_result = await embeddings.aembed_query(\"What is the capital of France?\")\n",
+    "    print(f\"Async query embedding dimension: {len(query_result)}\")\n",
+    "\n",
+    "    # Embed multiple documents\n",
+    "    docs = [\n",
+    "        \"Paris is the capital of France\",\n",
+    "        \"Berlin is the capital of Germany\",\n",
+    "        \"Rome is the capital of Italy\",\n",
+    "    ]\n",
+    "    docs_result = await embeddings.aembed_documents(docs)\n",
+    "    print(f\"Async document embeddings count: {len(docs_result)}\")\n",
+    "\n",
+    "\n",
+    "await generate_embeddings_async()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4aa82e17",
+   "metadata": {},
+   "source": [
+    "### Document Similarity Example"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "7e78e429",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Document Similarity Matrix:\n",
+      "Document 1: ['1.0000', '0.8282', '0.5811', '0.7985']\n",
+      "Document 2: ['0.8282', '1.0000', '0.5897', '0.8315']\n",
+      "Document 3: ['0.5811', '0.5897', '1.0000', '0.5918']\n",
+      "Document 4: ['0.7985', '0.8315', '0.5918', '1.0000']\n"
+     ]
+    }
+   ],
+   "source": [
+    "import numpy as np\n",
+    "from scipy.spatial.distance import cosine\n",
+    "\n",
+    "# Create some documents\n",
+    "documents = [\n",
+    "    \"Machine learning algorithms build mathematical models based on sample data\",\n",
+    "    \"Deep learning uses neural networks with many layers\",\n",
+    "    \"Climate change is a major global environmental challenge\",\n",
+    "    \"Neural networks are inspired by the human brain's structure\",\n",
+    "]\n",
+    "\n",
+    "# Embed the documents\n",
+    "embeddings_list = embeddings.embed_documents(documents)\n",
+    "\n",
+    "\n",
+    "# Function to calculate similarity\n",
+    "def calculate_similarity(embedding1, embedding2):\n",
+    "    return 1 - cosine(embedding1, embedding2)\n",
+    "\n",
+    "\n",
+    "# Print similarity matrix\n",
+    "print(\"Document Similarity Matrix:\")\n",
+    "for i, emb_i in enumerate(embeddings_list):\n",
+    "    similarities = []\n",
+    "    for j, emb_j in enumerate(embeddings_list):\n",
+    "        similarity = calculate_similarity(emb_i, emb_j)\n",
+    "        similarities.append(f\"{similarity:.4f}\")\n",
+    "    print(f\"Document {i+1}: {similarities}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f7a35f40",
+   "metadata": {},
+   "source": [
+    "## API Reference\n",
+    "\n",
+    "For more details about the Nebius AI Studio API, visit the [Nebius AI Studio Documentation](https://studio.nebius.ai/docs/api-reference)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "eb1eb70d",
+   "metadata": {},
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": ".venv",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.8"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
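The similarity matrix in the embeddings notebook's "Document Similarity Example" is computed pair by pair as `1 - cosine(...)`. A dependency-free variant, sketched below, normalizes each vector once and then takes dot products, filling only the upper triangle because the matrix is symmetric. The names `normalize` and `similarity_matrix` are hypothetical, and the short vectors in the usage note are made-up stand-ins for real embeddings.

```python
import math


def normalize(vector):
    """Scale a vector to unit length; all-zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def similarity_matrix(vectors):
    """Cosine similarity between every pair of vectors."""
    unit = [normalize(v) for v in vectors]
    n = len(unit)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # For unit vectors, the dot product equals the cosine similarity
            score = sum(a * b for a, b in zip(unit[i], unit[j]))
            matrix[i][j] = matrix[j][i] = score
    return matrix
```

For example, `similarity_matrix([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])` gives a diagonal of ones, a zero for the two orthogonal vectors, and about 0.7071 between each axis vector and the diagonal one.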
--- a/libs/packages.yml
+++ b/libs/packages.yml
@ -664,4 +664,7 @@ packages:
  path: .
 - name: langchain-featherless-ai
  repo: featherlessai/langchain-featherless-ai
-  path: .
+  path: .
+- name: langchain-nebius
+  path: libs/nebius
+  repo: nebius/langchain-nebius