POST
/
api
/
datasets
/
{id}
/
upload
curl --request POST \
  --url https://app.baseplate.ai/api/datasets/{id}/upload \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: <content-type>' \
  --data '{
  "chunk_size": 123,
  "chunk_overlap": 123,
  "document-id": "<string>",
  "metadata": "<string>"
}'

The tester above and examples on the right side of this page will not work since this API uses mutlipart form data. See bottom of page for examples.

Path Parameters

id
string
required

Dataset ID

Authorization
string
required

Baseplate API key. Must be in the format โ€œBearer $BASEPLATE_API_KEYโ€

Content-Type
string
required

Use multipart/form-data

Body

chunk_size
number
default: "1000"

Chunk size to chunk documents. Defaults to 1000 characters. Limited to 2000 for hybrid and 5000 for standard datasets.

chunk_overlap
number
default: "100"

Overlap size for chunks. Must be smaller than chunk size.

file
file
required

Attach a file to the request form-data. PDFs and csvs are automatically parsed. For csv files, if you need certain cells to be added as metadata, make sure the metadata columms already exist in the dataset.

For csv files, make sure there is a column named โ€œtextโ€, or else no embeddings will be generated. This column is the one you want embedded.

document-id
string

The document ID to create for this file. If none is provided, a uuidv4 ID is generated for you.

metadata
string

JSON stringified metadata to attach to the document. Must be a valid JSON string. Supported metadata value types are: string, number, boolean, and array of strings.

{
  "key1": "value1",
  "key2": 2,
  "key3": true,
  "key4": ["value4", "value5"]
}

Responses

{
  "body": "OK"
}

Examples

Replace DATASETIDandDATASET_ID and BASEPLATE_API_KEY with your own values.

Python

import requests
import io
url = "https://app.baseplate.ai/api/datasets/$DATASET_ID/upload"
response = requests.get("https://jsonplaceholder.typicode.com/todos/1")
response.raise_for_status()

payload={'chunk_size': '700',
'chunk_overlap': '100',
'document_id': 'test_document'}
file_like_object = io.BytesIO(response.content)
files=[
  ('file',('file.txt',file_like_object,'application/text'))
]
headers = {
  'Authorization': 'Bearer $BASEPLATE_API_KEY',
   'Content-Type': 'multipart/form-data'
}

response = requests.request("POST", url, headers=headers, data=payload, files=files)

print(response.text)
import requests

url = "https://app.baseplate.ai/api/datasets/$DATASET_ID/upload"

payload={'chunk_size': '700',
'chunk_overlap': '100',
'document_id': '$DOC_ID'}
files=[
  ('file',('file.txt',open('file.txt','rb'),'application/text'))
]
headers = {
  'Authorization': 'Bearer $BASEPLATE_API_KEY',
  'Content-Type': 'multipart/form-data'
}

response = requests.request("POST", url, headers=headers, data=payload, files=files)

print(response.text)

Node.js (Axios)

var axios = require("axios");
var FormData = require("form-data");
var fs = require("fs");
var data = new FormData();
data.append("file", fs.createReadStream("/path/to/file.txt"));
data.append("chunk_size", "3000");
data.append("chunk_overlap", "50");

var config = {
  method: "post",
  maxBodyLength: Infinity,
  url: "https://app.baseplate.ai/api/datasets/$DATASET_ID/upload",
  headers: {
    "Content-Type": "multipart/form-data",
    Authorization: "Bearer $BASEPLATE_API_KEY",
    ...data.getHeaders(),
  },
  data: data,
};

axios(config)
  .then(function (response) {
    console.log(JSON.stringify(response.data));
  })
  .catch(function (error) {
    console.log(error);
  });

Javascript (Fetch)

var myHeaders = new Headers();
myHeaders.append("Content-Type", "multipart/form-data");
myHeaders.append("Authorization", "Bearer $BASEPLATE_API_KEY");

var formdata = new FormData();
formdata.append("file", fileInput.files[0], "file.txt");
formdata.append("chunk_size", "3000");
formdata.append("chunk_overlap", "50");

var requestOptions = {
  method: "POST",
  headers: myHeaders,
  body: formdata,
  redirect: "follow",
};

fetch(
  "https://app.baseplate.ai/api/datasets/$DATASET_ID/upload",
  requestOptions
)
  .then((response) => response.text())
  .then((result) => console.log(result))
  .catch((error) => console.log("error", error));

cURL

curl --location 'https://app.baseplate.ai/api/datasets/$DATASET_ID/upload' \
--header 'Content-Type: multipart/form-data' \
--header 'Authorization: Bearer $BASEPLATE_API_KEY' \
--form 'file=@"/path/to/file.txt"' \
--form 'chunk_size="3000"' \
--form 'chunk_overlap="50"'