So you want to upload large files from your users' browsers directly to S3. You'll find plenty of examples of how to do this by uploading the whole thing in one go, but S3 also supports multipart uploads: you divide the file into chunks, which you can then upload in parallel (and even retry if something goes wrong!).
The example
The example code is in video-uploader. It includes a bunch of pieces:
- Frontend that allows picking a file and uploading it
- Backend lambdas (these wrap the Amazon API calls; the data being uploaded to S3 does not pass through a lambda)
- CloudFormation to deploy the frontend + lambdas + API gateways + CloudFront.
While the upload code is not a separate component, the files in the frontend/src/uploader directory do not modify the DOM directly, so they should be relatively easy to drop into other projects.
What you have to do
- (Frontend) User chooses a file
- (Backend) Create a Multipart Upload
- (Backend) Create a signed URL for each part
- (Frontend) Upload each part
- (Backend) Complete upload
- Handle failed uploads
(Frontend) User chooses a file
- Let the user pick a file, using an <input type="file">. From here on we'll refer to this input using the variable fileInput (in the example I get this using document.getElementById).
- When you're ready to begin uploading, look at the array fileInput.files. The example only allows a single file at once, so we'll refer to it as file.
- You can then find the size (file.size, in bytes) and name (file.name); a minimal sketch of this step is below.
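A minimal sketch, where the "file-input" element id and the startUpload callback are placeholders rather than names from the example:
const fileInput = document.getElementById('file-input') as HTMLInputElement;

fileInput.addEventListener('change', () => {
  const file = fileInput.files?.[0];
  if (!file) return;
  console.log(`Selected ${file.name} (${file.size} bytes)`);
  // startUpload(file) would kick off the remaining steps.
});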
(Backend) Create a Multipart Upload
This requires AWS credentials, so I did this using a lambda, beginUpload. Most of that function just performs authentication/authorisation checks to prevent abuse; the key part is client.createMultipartUpload, which could be as simple as:
const { UploadId } = await client
  .createMultipartUpload({
    Bucket: bucket,
    Key: objectName,
  })
  .promise();
You'll need to use this UploadId when creating signed URLs and when completing the upload. The frontend doesn't need to know the value of UploadId, but you may find it easier to have the browser include this when asking for signed URLs and when indicating the upload is finished. The example uses a signed JWT to prevent the end user modifying the UploadId, along the lines of the sketch below.
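A minimal sketch of that JWT step, assuming the jsonwebtoken package (the claim names, secret source, and expiry are illustrative, not taken from the example):
import jwt from 'jsonwebtoken';

// Sign the UploadId (and object key) into a token the browser can hold
// but not tamper with.
const token = jwt.sign(
  { uploadId: UploadId, key: objectName },
  process.env.JWT_SECRET!,
  { expiresIn: '2h' },
);

// Later lambdas verify the token and trust its claims, rather than any
// raw UploadId sent by the browser.
const { uploadId } = jwt.verify(token, process.env.JWT_SECRET!) as { uploadId: string; key: string };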
(Backend) Create a signed URL for each part
Example: getUploadURL
This creates URLs that the browser can upload each part to, and includes a signature so no further authentication is required. The PartNumber is 1-indexed (e.g. if you are going to upload the file in 3 parts, use 1, 2, 3). The Expires is in seconds; if you call this method immediately before each part then it doesn't need to be too long.
const signedURL = await client.getSignedUrlPromise('uploadPart', {
Bucket: bucket,
Key: objectName,
Expires: 30 * 60,
UploadId: uploadId,
PartNumber: partNumber,
});
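On the browser side, obtaining one of these URLs is just a request to whatever route API Gateway exposes for the lambda. A sketch, where the /getUploadURL path and its query parameters are placeholder names rather than the example's actual API:
async function fetchSignedUrl(token: string, partNumber: number): Promise<string> {
  // token is the signed JWT from beginUpload; partNumber is 1-indexed.
  const response = await fetch(`/getUploadURL?token=${encodeURIComponent(token)}&partNumber=${partNumber}`);
  const { signedURL } = await response.json();
  return signedURL;
}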
(Frontend) Upload each part
I upload files in 10,000,000 byte chunks (const FILE_CHUNK_SIZE = 10_000_000;). S3 requires every part except the last to be at least 5 MB, so 10 MB leaves comfortable headroom.
- Obtain a signed URL for this part (the full loop combining these steps is sketched after this list)
- Get a blob for this part of the file (note that partNumber here is a 0-indexed loop counter; the PartNumber sent to S3 is partNumber + 1):
const slice = file.slice(partNumber * FILE_CHUNK_SIZE, Math.min((partNumber + 1) * FILE_CHUNK_SIZE, file.size));
- Use XHR/Fetch to PUT to the signed URL. I use axios: const output = await axios.put(uploadUrl, slice);
- Keep track of the etag header; you'll need this to finish the upload: const etag = (output.headers as { etag: string }).etag;
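Putting those steps together, a sketch of the whole loop, reusing the fetchSignedUrl placeholder from earlier (the naive Promise.all parallelism is mine; the example may schedule parts differently):
import axios from 'axios';

const FILE_CHUNK_SIZE = 10_000_000;

// Upload every part in parallel, collecting the { ETag, PartNumber }
// pairs that completeMultipartUpload needs.
async function uploadParts(file: File, token: string) {
  const partCount = Math.ceil(file.size / FILE_CHUNK_SIZE);
  const uploads = Array.from({ length: partCount }, async (_, partNumber) => {
    const uploadUrl = await fetchSignedUrl(token, partNumber + 1); // PartNumber is 1-indexed
    const slice = file.slice(partNumber * FILE_CHUNK_SIZE, Math.min((partNumber + 1) * FILE_CHUNK_SIZE, file.size));
    const output = await axios.put(uploadUrl, slice);
    return { ETag: (output.headers as { etag: string }).etag, PartNumber: partNumber + 1 };
  });
  return Promise.all(uploads);
}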
(Backend) Complete upload
Example: finishUpload
This tells S3 to assemble your multipart chunks into a single file. You'll need an array with the etag from each successfully uploaded chunk:
await client
.completeMultipartUpload({
Bucket: bucket,
Key: objectName,
UploadId: uploadId,
MultipartUpload: { Parts: [{ ETag: 'part1 etag', PartNumber: 1 }, ...] },
})
.promise();
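S3 requires the Parts array to be in ascending PartNumber order; a helper like the uploadParts sketch above already returns it that way.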
Handle failed uploads
Incomplete uploads use S3 storage, so you should clean them up. There are a couple of ways to do this:
- Tell S3 when you know an upload has failed. The example has abandonUpload for this.
- Use an S3 lifecycle rule. The example uses CloudFormation to set AbortIncompleteMultipartUpload to delete uploads that aren't completed within 2 days.
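The S3 call behind an abandonUpload-style lambda could be a sketch like this, using the same aws-sdk v2 client as the other snippets; abortMultipartUpload discards any parts that were already uploaded:
await client
  .abortMultipartUpload({
    Bucket: bucket,
    Key: objectName,
    UploadId: uploadId,
  })
  .promise();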