# File Inputs

> Send images, video, and documents to multimodal models as part of a chat completion.

Maitai lets you attach files — **images, video, and documents (PDFs)** — to a
chat completion request. Maitai stores the file, hands it to the model in the
format that provider expects, and runs inference.

There are two ways to send a file, and you can mix them freely:

1. **Inline with the request** — attach the file directly in the message. Best
   for one-off requests; nothing to pre-upload.
2. **Upload first, then reference by `file_id`** — call `files.upload(...)` once
   to get a `file_id`, then reference it on any number of later requests without
   re-sending the bytes. Best when you ask multiple questions about the same
   (especially large) file.

## Quickstart

Attach a local file inline. Maitai reads the bytes, validates that the target
model accepts that file type, and includes it in the request.

<CodeGroup>
  ```python Python theme={null}
  import maitai

  client = maitai.Maitai()

  messages = [
      {
          "role": "user",
          "content": [
              {"type": "text", "text": "Describe what happens in this video."},
              maitai.file_content_part_from_path("clip.mp4"),
          ],
      }
  ]

  response = client.chat.completions.create(
      messages=messages,
      application="demo_app",
      intent="VIDEO_ANALYSIS",
      session_id="YOUR_SESSION_ID",
      model="gemini-3.5-flash",
      server_side_inference=True,
  )

  print(response.choices[0].message.content)
  ```

  ```javascript Node theme={null}
  import fs from "node:fs";
  import Maitai, { fileContentPartFromBytes } from "maitai";

  const maitai = new Maitai();

  const bytes = fs.readFileSync("clip.mp4");

  const messages = [
    {
      role: "user",
      content: [
        { type: "text", text: "Describe what happens in this video." },
        fileContentPartFromBytes(bytes, "video/mp4", "clip.mp4"),
      ],
    },
  ];

  const response = await maitai.chat.completions.create({
    messages,
    application: "demo_app",
    intent: "VIDEO_ANALYSIS",
    session_id: "YOUR_SESSION_ID",
    model: "gemini-3.5-flash",
    server_side_inference: true,
  });

  console.log(response.choices[0].message.content);
  ```
</CodeGroup>

<Note>
  File inputs require server-side inference (`server_side_inference=True` in
  Python, `server_side_inference: true` in Node), which is the default. Maitai
  must run the request itself so that it can upload the file and convert it to
  the format the provider expects.
</Note>

## What's supported

File support depends on the model. Maitai validates the file against the target
model and returns a `400` if the model does not accept that file type.

| Provider                             | Image | Video | Document (PDF) |
| ------------------------------------ | :---: | :---: | :------------: |
| **Gemini** (e.g. `gemini-3.5-flash`) |   ✓   |   ✓   |        ✓       |
| **OpenAI** (e.g. `gpt-4o`)           |   ✓   |   —   |        ✓       |

<Warning>
  **Video is currently supported on Gemini models only.** Sending a video to a
  model that does not support it (for example an OpenAI model) returns a `400`
  error — Maitai will not silently route it elsewhere. Choose a Gemini model for
  video.
</Warning>

Supported file types:

* **Images** — PNG, JPEG, GIF, WebP
* **Video** — MP4, WebM, MOV (Gemini models only)
* **Documents** — PDF
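
The support matrix above can also be checked locally before a request is sent, which saves a round trip that would end in a `400`. The following is a minimal sketch, not Maitai's own validation logic: the provider keys `"gemini"` and `"openai"` and the helper names `file_kind` and `provider_accepts` are hypothetical, and the check covers only the broad category (image, video, document), not the specific formats listed.

```python
import mimetypes

# Support matrix copied from the table above. The keys are names chosen for
# this sketch, not identifiers used by the Maitai API.
SUPPORTED_KINDS = {
    "gemini": {"image", "video", "document"},
    "openai": {"image", "document"},
}


def file_kind(path, mime_type=None):
    """Classify a file as 'image', 'video', or 'document' from its MIME type."""
    mime = mime_type or mimetypes.guess_type(path)[0]
    if mime is None:
        raise ValueError(f"cannot infer MIME type for {path!r}; pass mime_type")
    if mime == "application/pdf":
        return "document"
    major = mime.split("/", 1)[0]
    if major in ("image", "video"):
        return major
    raise ValueError(f"unsupported MIME type: {mime}")


def provider_accepts(provider, path, mime_type=None):
    """Return True if the table above lists this file kind for the provider."""
    return file_kind(path, mime_type) in SUPPORTED_KINDS[provider]
```

The server remains the source of truth: a file that passes this check can still be rejected, so keep handling the `400` response as well.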

## The `file` content part

A file is just another entry in a message's `content` array. It has the shape:

```json theme={null}
{ "type": "file", "file": { "file_data": "data:video/mp4;base64,<...>", "filename": "clip.mp4" } }
```

You normally don't write this by hand — use the helpers below.
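
If you do need to build the part yourself, for example to inspect what is sent, the JSON shape above can be reproduced with the standard library alone. This is a sketch under that assumption: `inline_file_part` is a hypothetical name, and whether the SDK helpers emit exactly this structure is not confirmed here.

```python
import base64


def inline_file_part(data, mime_type, filename=None):
    """Build a {"type": "file", ...} content part carrying a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    file = {"file_data": f"data:{mime_type};base64,{encoded}"}
    if filename is not None:
        file["filename"] = filename
    return {"type": "file", "file": file}
```

Base64 inflates the payload by roughly a third, which is one reason to prefer uploading large files once and referencing them by `file_id`.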

### Python

<ParamField path="maitai.file_content_part_from_path(path, mime_type=None)" type="dict">
  Build an inline file part from a local file path. Reads the file and
  base64-encodes it. The MIME type is inferred from the file extension when not
  provided.
</ParamField>

```python theme={null}
import maitai

# Image
maitai.file_content_part_from_path("diagram.png")

# Document
maitai.file_content_part_from_path("report.pdf")

# Explicit MIME type
maitai.file_content_part_from_path("recording", mime_type="video/mp4")
```

### Node

<ParamField path="fileContentPartFromBytes(bytes, mimeType, filename?)" type="object">
  Build an inline file part from raw bytes (`Uint8Array` or `Buffer`). The MIME
  type is required; `filename` is optional but recommended for documents.
</ParamField>

```javascript theme={null}
import fs from "node:fs";
import { fileContentPartFromBytes } from "maitai";

const bytes = fs.readFileSync("diagram.png");
fileContentPartFromBytes(bytes, "image/png", "diagram.png");
```

## Multiple files in one message

Combine text and several files in a single message. Order is preserved.

<CodeGroup>
  ```python Python theme={null}
  import maitai

  messages = [
      {
          "role": "user",
          "content": [
              {"type": "text", "text": "Compare these two screenshots."},
              maitai.file_content_part_from_path("before.png"),
              maitai.file_content_part_from_path("after.png"),
          ],
      }
  ]
  ```

  ```javascript Node theme={null}
  import fs from "node:fs";
  import { fileContentPartFromBytes } from "maitai";

  const messages = [
    {
      role: "user",
      content: [
        { type: "text", text: "Compare these two screenshots." },
        fileContentPartFromBytes(fs.readFileSync("before.png"), "image/png", "before.png"),
        fileContentPartFromBytes(fs.readFileSync("after.png"), "image/png", "after.png"),
      ],
    },
  ];
  ```
</CodeGroup>

## Reuse a file across requests

When you'll reference the same file in more than one request, upload it once
with `files.upload(...)` to get a `file_id`, then pass that `file_id` with
`file_content_part` (Python) / `fileContentPart` (Node). This avoids re-sending
the bytes — and for video, avoids re-preparing it for the provider — on every
call.

<CodeGroup>
  ```python Python theme={null}
  import maitai

  client = maitai.Maitai()

  # Upload once
  file_id = client.files.upload("clip.mp4")

  # Reference it across as many requests as you like
  for question in ["What happens first?", "Who is speaking?", "Summarize it."]:
      response = client.chat.completions.create(
          messages=[
              {
                  "role": "user",
                  "content": [
                      {"type": "text", "text": question},
                      maitai.file_content_part(file_id),
                  ],
              }
          ],
          application="demo_app",
          intent="VIDEO_ANALYSIS",
          model="gemini-3.5-flash",
          server_side_inference=True,
      )
      print(response.choices[0].message.content)
  ```

  ```javascript Node theme={null}
  import Maitai, { fileContentPart } from "maitai";

  const maitai = new Maitai();

  // Upload once (from a path, or use uploadBytes for in-memory data)
  const fileId = await maitai.files.upload("clip.mp4");

  // Reference it across as many requests as you like
  for (const question of ["What happens first?", "Who is speaking?", "Summarize it."]) {
    const response = await maitai.chat.completions.create({
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: question },
            fileContentPart(fileId),
          ],
        },
      ],
      application: "demo_app",
      intent: "VIDEO_ANALYSIS",
      model: "gemini-3.5-flash",
      server_side_inference: true,
    });
    console.log(response.choices[0].message.content);
  }
  ```
</CodeGroup>
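
In a longer-running process it can be convenient to remember which files have already been uploaded, so that the same bytes are not uploaded twice. The sketch below keys an in-memory cache on a SHA-256 hash of the file contents. `FileIdCache` is a hypothetical helper, not part of the SDK; it accepts any callable that takes a path and returns a `file_id`, such as `client.files.upload`.

```python
import hashlib
from pathlib import Path


class FileIdCache:
    """Remember file_ids by content hash so identical bytes upload once."""

    def __init__(self, upload):
        # `upload` is any callable that takes a path and returns a file_id,
        # for example `client.files.upload` from the example above.
        self._upload = upload
        self._ids = {}

    def get(self, path):
        # Hash the contents rather than the path, so a renamed copy of the
        # same file still maps to the same file_id.
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        if digest not in self._ids:
            self._ids[digest] = self._upload(str(path))
        return self._ids[digest]
```

Usage would look like `cache = FileIdCache(client.files.upload)` followed by `maitai.file_content_part(cache.get("clip.mp4"))`. The cache lives only for the lifetime of the process and reads the whole file into memory to hash it; how long an uploaded `file_id` stays valid is not covered on this page, so treat long-lived caching as an assumption to verify.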

### Upload methods

<ParamField path="client.files.upload(path, mime_type=None)" type="str (Python)">
  Upload a file from a local path; returns the `file_id`. MIME type is inferred
  from the extension when not provided.
</ParamField>

<ParamField path="client.files.upload_bytes(data, filename, mime_type=None)" type="str (Python)">
  Upload raw bytes; returns the `file_id`.
</ParamField>

<ParamField path="maitai.files.upload(filePath, mimeType?)" type="Promise<string> (Node)">
  Upload a file from a local path; resolves to the `file_id`.
</ParamField>

<ParamField path="maitai.files.uploadBytes(bytes, filename, mimeType?)" type="Promise<string> (Node)">
  Upload raw bytes (`Uint8Array` or `Buffer`); resolves to the `file_id`.
</ParamField>

<ParamField path="file_content_part(file_id)" type="content part">
  `maitai.file_content_part(file_id)` (Python) / `fileContentPart(fileId)` (Node)
  builds the content part that references an uploaded file.
</ParamField>

## Image URLs

For images already hosted at a public URL, you can use the standard OpenAI
`image_url` content part instead of uploading bytes:

<CodeGroup>
  ```python Python theme={null}
  messages = [
      {
          "role": "user",
          "content": [
              {"type": "text", "text": "What's in this image?"},
              {
                  "type": "image_url",
                  "image_url": {"url": "https://example.com/photo.jpg"},
              },
          ],
      }
  ]
  ```

  ```javascript Node theme={null}
  const messages = [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        { type: "image_url", image_url: { url: "https://example.com/photo.jpg" } },
      ],
    },
  ];
  ```
</CodeGroup>

## Errors

<ResponseField name="400 — model does not accept <type> file inputs" type="error">
  The target model does not support that file type. For example, sending a video
  to an OpenAI model. Switch to a model that supports it (a Gemini model for
  video), or remove the file.
</ResponseField>
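
One way to act on this error is to retry once with a model that accepts the file type. The sketch below is illustrative only. It assumes the SDK raises an exception that exposes a `status_code` attribute and whose message contains the text shown above; both are assumptions, so check the exception types your SDK version actually raises. `is_unsupported_file_error` and `create_with_fallback` are hypothetical names.

```python
def is_unsupported_file_error(exc):
    """Heuristic check for the 400 described above.

    Assumes the exception exposes `status_code`; that attribute name is a
    guess for this sketch, not documented SDK behavior.
    """
    status = getattr(exc, "status_code", None)
    return status == 400 and "file input" in str(exc).lower()


def create_with_fallback(create, request, fallback_model):
    """Call `create(**request)`; on an unsupported-file 400, retry once
    with `fallback_model` substituted for the original model."""
    try:
        return create(**request)
    except Exception as exc:
        if not is_unsupported_file_error(exc):
            raise
        return create(**{**request, "model": fallback_model})
```

Here `create` would be `client.chat.completions.create` and `request` a dict of its keyword arguments. Any other error is re-raised unchanged, so the fallback never hides unrelated failures.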

## Notes

* **Video is uploaded once and prepared for the model**, which can take a few
  seconds for larger files; this happens on the first request that references it.
* Files are scoped to your company and are not shared across accounts.
* File inputs work with the rest of the request as usual — combine them with
  [Structured Output](/sdk/structured_output), [Tool Calling](/sdk/tool_calling),
  and streaming.
