AWS

AWS : S3 Storage

What are we talking about this time?

This time we are going to talk about AWS S3 storage. I have chosen to start with S3, as storage lies at the heart of a great many cloud services, in both AWS and Azure. For example, you can:

 

  • Use S3 blobs to create external SQL tables (AWS Athena)
  • Use S3 storage with Kafka
  • Use S3 with data warehouses such as AWS Redshift
  • Use S3 with Apache Spark
  • Use S3 with AWS Lambda
  • Receive events when a new S3 operation occurs

 

These are just some of the things you can do with S3 storage, so it's a great starting point. There are other storage options in AWS too, such as:

  • Glacier (archive / slow moving data storage)
  • EFS (file system)
  • Storage Gateway

I will probably cover some of these in future posts too, but for now let's stick to what this post will cover, which is standard S3.

 

Initial setup

If you did not read the very first part of this series of posts, I urge you to go and read that one now as it shows you how to get started with AWS, and create an IAM user : https://sachabarbs.wordpress.com/2018/08/30/aws-initial-setup/

 

Where is the code?

The code for this post can be found here in GitHub : https://github.com/sachabarber/AWS/tree/master/Storage/S3BucketsAndKeys

 

Ok so how does S3 work?

S3 has the concept of a bucket, which is a top-level entity. You may have multiple buckets, each of which may have metadata, public/private permissions, automatic encryption and so on enabled on it. Each bucket contains the objects (files) that you have uploaded, each addressed by a key. Conceptually there is not a lot more to it.
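To make that a little more concrete, here is a tiny sketch of how a bucket and a key combine to address an object. The bucket name, key and region below are made-up values, and the exact public URL format varies by region and has changed over time, so treat it as illustrative only.

// an object is addressed by bucket + key; keys are flat strings,
// but a "/" in the key is commonly used to fake a folder hierarchy
var bucketName = "my-demo-bucket";                  // hypothetical bucket name
var keyName = "reports/2018/august-summary.txt";    // hypothetical key with pseudo-folders

// for a publicly readable object, a virtual-hosted style URL typically looks like this
var url = $"https://{bucketName}.s3.eu-west-2.amazonaws.com/{keyName}";
Console.WriteLine(url);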

 

IAM user privileges needed for S3

You will need to add the following permission to your IAM user to allow it to use S3:

 

  • AmazonS3FullAccess

 

Obviously if you are working in a team you will not want to give out full access, but for this series of posts this is fine.


So what will we try and cover in this post?

To my mind, what I am trying to cover is all (or most) of the things I would want to do if I were using Azure Blob Storage. To that end, this is the list of things I will cover in this post:

 

  • List all buckets
  • Create a bucket
  • Write an object to a bucket
  • Write an object to a bucket with a pre-check to see if it exists
  • Write a Stream to a bucket
  • Read an object from a bucket
  • Delete an object from a bucket
  • List all objects in a bucket

 

Install the NuGet packages

So let's start. The first thing we need to do is install the NuGet packages, which for this demo are:

 

  • AWSSDK.S3
  • Nito.AsyncEx (nice async/await extensions, e.g. awaitable console apps)

 

OK, now that we have that in place, and we know how to use the default profile that is linked to the IAM user for this demo series (thanks to the 1st post), we can just go through the items on the list above one by one.
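With the default profile configured, new AmazonS3Client() will pick up credentials and region on its own. If you prefer to be explicit, something along the following lines should also work; this is just a sketch, and the "default" profile name plus the EUWest2 region are assumptions on my part rather than anything the demo code relies on.

// CredentialProfileStoreChain lives in Amazon.Runtime.CredentialManagement (AWSSDK.Core)
var chain = new CredentialProfileStoreChain();
if (chain.TryGetAWSCredentials("default", out var awsCredentials))
{
    // pass an explicit region rather than relying on the profile/config defaults
    var explicitClient = new AmazonS3Client(awsCredentials, RegionEndpoint.EUWest2);
}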

 

List all buckets

var client = new AmazonS3Client();
ListBucketsResponse response = await client.ListBucketsAsync();
foreach (S3Bucket bucket in response.Buckets)
{
    Console.WriteLine("You own Bucket with name: {0}", bucket.BucketName);
}

 

Create a bucket

This example shows you how to make the bucket public/private

var client = new AmazonS3Client();

if (client.DoesS3BucketExist(bucketToCreate))
{
    Console.WriteLine($"{bucketToCreate} already exists, skipping this step");
}
else
{
    PutBucketRequest putBucketRequest = new PutBucketRequest()
    {
        BucketName = bucketToCreate,
        BucketRegion = S3Region.EUW2,
        CannedACL = isPublic ? S3CannedACL.PublicRead : S3CannedACL.Private
    };

    var response = await client.PutBucketAsync(putBucketRequest);
}
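One thing to be aware of: depending on the SDK version you have installed, the existence check used above may not be exposed directly on the client. In more recent AWSSDK.S3 releases there is a static helper on AmazonS3Util that can be awaited instead; the sketch below is my assumption of how that looks, so it is worth verifying against your installed version.

// hedged sketch: newer SDK versions expose the bucket existence check as a static async helper
bool bucketExists = await Amazon.S3.Util.AmazonS3Util.DoesS3BucketExistV2Async(client, bucketToCreate);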

 

Write an object to a bucket

This example also shows you how to attach metadata to an object in a bucket

var client = new AmazonS3Client();

// simple object put
PutObjectRequest request = new PutObjectRequest()
{
    ContentBody = "this is a test",
    BucketName = bucketName,
    Key = keyName
};

PutObjectResponse response = await client.PutObjectAsync(request);

// put a more complex object with some metadata attached
PutObjectRequest titledRequest = new PutObjectRequest()
{
    ContentBody = "this is a test",
    BucketName = bucketName,
    Key = keyName
};
titledRequest.Metadata.Add("title", "the title");

await client.PutObjectAsync(titledRequest);


Write an object to a bucket with a pre-check to see if it exists

var client = new AmazonS3Client();

if (!S3FileExists(bucketName, uniqueKeyName))
{

    // simple object put
    Console.WriteLine($"Adding file {uniqueKeyName}");
    PutObjectRequest request = new PutObjectRequest()
    {
        ContentBody = "this is a test",
        BucketName = bucketName,
        Key = uniqueKeyName
    };

    PutObjectResponse response = await client.PutObjectAsync(request);
}
else
{
    Console.WriteLine($"File {uniqueKeyName} existed");
}

....
....
....



bool S3FileExists(string bucketName, string keyName)
{
    var s3FileInfo = new Amazon.S3.IO.S3FileInfo(client, bucketName, keyName);
    return s3FileInfo.Exists;
}
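One caveat worth mentioning: S3FileInfo comes from the older Amazon.S3.IO namespace, which is not available in every flavour of the SDK (the .NET Core targets in particular). If all you need is a yes/no answer, a rough equivalent, sketched here as an assumption rather than the author's code, is to ask for the object's metadata and treat a 404 as "does not exist":

// hedged alternative existence check using only the plain S3 client
async Task<bool> S3ObjectExistsAsync(IAmazonS3 s3Client, string bucketName, string keyName)
{
    try
    {
        // GetObjectMetadata issues a HEAD request, so the object body is never downloaded
        await s3Client.GetObjectMetadataAsync(bucketName, keyName);
        return true;
    }
    catch (AmazonS3Exception e) when (e.StatusCode == System.Net.HttpStatusCode.NotFound)
    {
        return false;
    }
}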

 

Write a Stream to a bucket

There are plenty of times when you want to upload/download data as Streams, because you don't want to take the memory hit of loading it all into memory in one go. For this reason it is a good idea to work with Stream objects. Here is an upload example using the TransferUtility, which is something we might look at in more depth in another dedicated post:

var client = new AmazonS3Client();

//sly inner function
async Task<Stream> GenerateStreamFromStringAsync(string s)
{
    var stream = new MemoryStream();
    var writer = new StreamWriter(stream);
    await writer.WriteAsync(s);
    await writer.FlushAsync();
    stream.Position = 0;
    return stream;
}

var bucketToCreate = $"public-{bucketName}";
await CreateABucketAsync(bucketToCreate);

var fileTransferUtility = new TransferUtility(client);
var fileName = Guid.NewGuid().ToString("N");
using (var streamToUpload = await GenerateStreamFromStringAsync("some random string contents"))
{
    var uploadRequest = new TransferUtilityUploadRequest()
    {
        InputStream = streamToUpload,
        Key = $"{fileName}.txt",
        BucketName = bucketToCreate,
        CannedACL = S3CannedACL.PublicRead
    };

    await fileTransferUtility.UploadAsync(uploadRequest);
}
Console.WriteLine($"Upload using stream to file '{fileName}' completed");


Read an object from a bucket

var client = new AmazonS3Client();

GetObjectRequest request = new GetObjectRequest()
{
    BucketName = bucketName,
    Key = keyName
};

using (GetObjectResponse response = await client.GetObjectAsync(request))
{
    string title = response.Metadata["x-amz-meta-title"];
    Console.WriteLine("The object's title is {0}", title);
    string dest = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), keyName);
    if (!File.Exists(dest))
    {
        await response.WriteResponseStreamToFileAsync(dest, true, CancellationToken.None);
    }
}


Delete an object from a bucket

var client = new AmazonS3Client();

DeleteObjectRequest request = new DeleteObjectRequest()
{
    BucketName = bucketName,
    Key = keyName
};

await client.DeleteObjectAsync(request);
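If you have more than one key to remove, the SDK also supports deleting in batches via DeleteObjectsRequest, which saves a round trip per key. A small sketch (the key names here are made up for illustration):

// delete several keys in a single request
DeleteObjectsRequest batchDeleteRequest = new DeleteObjectsRequest()
{
    BucketName = bucketName
};
batchDeleteRequest.AddKey("first-example-key");   // hypothetical key
batchDeleteRequest.AddKey("second-example-key");  // hypothetical key

DeleteObjectsResponse batchDeleteResponse = await client.DeleteObjectsAsync(batchDeleteRequest);
Console.WriteLine($"Deleted {batchDeleteResponse.DeletedObjects.Count} objects");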

 

List all objects in a bucket

var client = new AmazonS3Client();

ListObjectsRequest request = new ListObjectsRequest();
request.BucketName = bucketName;
ListObjectsResponse response = await client.ListObjectsAsync(request);
foreach (S3Object entry in response.S3Objects)
{
    Console.WriteLine("key = {0} size = {1}", entry.Key, entry.Size);
}

// list only things starting with "foo"
request.Prefix = "foo";
response = await client.ListObjectsAsync(request);
foreach (S3Object entry in response.S3Objects)
{
    Console.WriteLine("key = {0} size = {1}", entry.Key, entry.Size);
}

// list only things that come after "bar" alphabetically
request.Prefix = null;
request.Marker = "bar";
response = await client.ListObjectsAsync(request);
foreach (S3Object entry in response.S3Objects)
{
    Console.WriteLine("key = {0} size = {1}", entry.Key, entry.Size);
}

// only list 3 things
request.Prefix = null;
request.Marker = null;
request.MaxKeys = 3;
response = await client.ListObjectsAsync(request);
foreach (S3Object entry in response.S3Objects)
{
    Console.WriteLine("key = {0} size = {1}", entry.Key, entry.Size);
}
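One thing to keep in mind with listing is that a single call returns at most 1,000 keys, so for larger buckets you have to page through the results. The SDK also exposes the ListObjectsV2 variant of the API, which pages via a continuation token; a hedged sketch of how that might look:

// page through every key in the bucket, 1000 keys per request
var pagedRequest = new ListObjectsV2Request { BucketName = bucketName };
ListObjectsV2Response pagedResponse;
do
{
    pagedResponse = await client.ListObjectsV2Async(pagedRequest);
    foreach (S3Object entry in pagedResponse.S3Objects)
    {
        Console.WriteLine("key = {0} size = {1}", entry.Key, entry.Size);
    }

    // the continuation token tells S3 where the next page should start
    pagedRequest.ContinuationToken = pagedResponse.NextContinuationToken;
} while (pagedResponse.IsTruncated);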


After running the demo code associated with this article, we should see something like this:

 

(Screenshot: console output from running the demo code)


Conclusion

I think I have shown the most common things you may want to do with AWS S3. We will 100% be looking at some of the integration points with other services in the future. I hope I have also shown that this stuff is not that hard, and that the APIs are quite intuitive.
