Upload a Document (file) to a Dataset
Upload a document directly and specify the chunk size, chunk overlap, and metadata. Uploads are limited to 100 MB.
The tester above and the examples on the right side of this page will not work because this API uses multipart form data. See the bottom of the page for working examples.
Path Parameters
Dataset ID
Header
Baseplate API key. Must be in the format "Bearer $BASEPLATE_API_KEY".
Use multipart/form-data
Body
Chunk size, in characters, used to split the document. Defaults to 1000. Limited to 2000 for hybrid datasets and 5000 for standard datasets.
Overlap size between chunks. Must be smaller than the chunk size.
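The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Baseplate API: the function name validate_chunking and the dataset_type labels are hypothetical, and the limits are assumed to be inclusive.

```python
# Limits copied from the parameter descriptions above. The keys "hybrid" and
# "standard" are labels chosen for this sketch, not API values.
MAX_CHUNK_SIZE = {"hybrid": 2000, "standard": 5000}


def validate_chunking(chunk_size, chunk_overlap, dataset_type):
    """Return the two form fields as strings, or raise ValueError."""
    if dataset_type not in MAX_CHUNK_SIZE:
        raise ValueError(f"unknown dataset type: {dataset_type!r}")
    limit = MAX_CHUNK_SIZE[dataset_type]
    # Assumption: the documented limit itself is an allowed value.
    if not 0 < chunk_size <= limit:
        raise ValueError(f"chunk_size must be between 1 and {limit}")
    # The overlap must be strictly smaller than the chunk size.
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    # Form fields are sent as strings, as in the examples on this page.
    return {"chunk_size": str(chunk_size), "chunk_overlap": str(chunk_overlap)}
```

The returned dictionary can be passed as the data argument of a requests call, alongside the file.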
Attach a file to the request form-data. PDFs and CSVs are parsed automatically. For CSV files, if you need certain cells to be added as metadata, make sure the metadata columns already exist in the dataset.
For CSV files, make sure there is a column named "text", or else no embeddings will be generated. This is the column whose contents are embedded.
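Because a CSV without a "text" column produces no embeddings, it can help to inspect the header row before uploading. This is a rough sketch using only the Python standard library. The helper name csv_has_text_column is hypothetical, and it assumes the column name must match "text" exactly, which this page does not state.

```python
import csv
import io


def csv_has_text_column(csv_source):
    """Return True if the first row of the CSV contains a column named "text"."""
    reader = csv.reader(csv_source)
    header = next(reader, [])  # empty list for an empty file
    # Assumption: the match is exact and case-sensitive.
    return "text" in header


# In-memory sample; for a real file use open("data.csv", newline="").
sample = io.StringIO("id,text,category\n1,hello world,greeting\n")
has_text = csv_has_text_column(sample)
```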
The document ID to assign to this file. If none is provided, a UUIDv4 ID is generated for you.
JSON stringified metadata to attach to the document. Must be a valid JSON string. Supported metadata value types are: string, number, boolean, and array of strings.
{
"key1": "value1",
"key2": 2,
"key3": true,
"key4": ["value4", "value5"]
}
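Since the metadata must be sent as a JSON string, one way to prepare it is to check the value types first and then serialize with json.dumps. The sketch below is illustrative only. The helper names are hypothetical, and the type check reflects the four supported value types listed above.

```python
import json


def is_supported_value(value):
    """True for string, number, boolean, or a list containing only strings."""
    if isinstance(value, (str, bool, int, float)):
        return True
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


def serialize_metadata(metadata):
    """Validate the value types, then return the metadata as a JSON string."""
    for key, value in metadata.items():
        if not is_supported_value(value):
            raise ValueError(f"unsupported metadata value for {key!r}")
    return json.dumps(metadata)


metadata_json = serialize_metadata(
    {"key1": "value1", "key2": 2, "key3": True, "key4": ["value4", "value5"]}
)
```

The resulting string is what goes into the metadata form field of the request.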
Responses
{
"body": "OK"
}
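A client can compare the parsed response against the body shown above. The following sketch makes two assumptions: that a successful upload returns a 2xx status code, which this page does not state, and that the helper name upload_succeeded is purely illustrative.

```python
def upload_succeeded(status_code, payload):
    """True when the status is 2xx and the JSON body matches the documented response."""
    # Assumption: success is signalled by a 2xx status code.
    if not 200 <= status_code < 300:
        return False
    return isinstance(payload, dict) and payload.get("body") == "OK"
```

With the requests library this could be called as upload_succeeded(response.status_code, response.json()).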
Examples
Replace the $BASEPLATE_API_KEY, $DATASET_ID, and $DOC_ID placeholders with your own values.
Python
# Example 1: upload content held in memory (here, fetched from another URL)
import io

import requests

url = "https://app.baseplate.ai/api/datasets/$DATASET_ID/upload"

# Download some sample content to upload
source = requests.get("https://jsonplaceholder.typicode.com/todos/1")
source.raise_for_status()

payload = {
    "chunk_size": "700",
    "chunk_overlap": "100",
    "document_id": "test_document",
}

# Wrap the downloaded bytes in a file-like object
file_like_object = io.BytesIO(source.content)
files = [("file", ("file.txt", file_like_object, "text/plain"))]

# Do not set Content-Type yourself: requests generates the multipart/form-data
# header, including the required boundary, when files= is passed.
headers = {"Authorization": "Bearer $BASEPLATE_API_KEY"}

response = requests.post(url, headers=headers, data=payload, files=files)
print(response.text)
# Example 2: upload a file from disk
import requests

url = "https://app.baseplate.ai/api/datasets/$DATASET_ID/upload"

payload = {
    "chunk_size": "700",
    "chunk_overlap": "100",
    "document_id": "$DOC_ID",
}

# Do not set Content-Type yourself: requests adds it with the multipart boundary
headers = {"Authorization": "Bearer $BASEPLATE_API_KEY"}

# Open the file in binary mode and close it once the request completes
with open("file.txt", "rb") as f:
    files = [("file", ("file.txt", f, "text/plain"))]
    response = requests.post(url, headers=headers, data=payload, files=files)

print(response.text)
Node.js (Axios)
var axios = require("axios");
var FormData = require("form-data");
var fs = require("fs");
var data = new FormData();
data.append("file", fs.createReadStream("/path/to/file.txt"));
data.append("chunk_size", "3000");
data.append("chunk_overlap", "50");
var config = {
method: "post",
maxBodyLength: Infinity,
url: "https://app.baseplate.ai/api/datasets/$DATASET_ID/upload",
headers: {
Authorization: "Bearer $BASEPLATE_API_KEY",
// getHeaders() supplies Content-Type: multipart/form-data with the boundary
...data.getHeaders(),
},
data: data,
};
axios(config)
.then(function (response) {
console.log(JSON.stringify(response.data));
})
.catch(function (error) {
console.log(error);
});
JavaScript (Fetch)
// Do not set Content-Type manually: the browser adds multipart/form-data
// with the correct boundary when the body is a FormData object.
var myHeaders = new Headers();
myHeaders.append("Authorization", "Bearer $BASEPLATE_API_KEY");
var formdata = new FormData();
formdata.append("file", fileInput.files[0], "file.txt");
formdata.append("chunk_size", "3000");
formdata.append("chunk_overlap", "50");
var requestOptions = {
method: "POST",
headers: myHeaders,
body: formdata,
redirect: "follow",
};
fetch(
"https://app.baseplate.ai/api/datasets/$DATASET_ID/upload",
requestOptions
)
.then((response) => response.text())
.then((result) => console.log(result))
.catch((error) => console.log("error", error));
cURL
curl --location 'https://app.baseplate.ai/api/datasets/$DATASET_ID/upload' \
--header 'Content-Type: multipart/form-data' \
--header 'Authorization: Bearer $BASEPLATE_API_KEY' \
--form 'file=@"/path/to/file.txt"' \
--form 'chunk_size="3000"' \
--form 'chunk_overlap="50"'