Scott Crawford

July 23 2019

Revising Anahita's S3 Storage Plugin

I'm starting this topic to document the upgrades the S3 storage plugin appears to need. I'll post relevant background in the topic itself, then post the questions I have at the moment as comments.

Here is an AWS source document I've found describing recent changes:

https://docs.aws.amazon.com/general/latest/gr/sigv4_changes.html

Key points:

The previous signing method is referred to as Signature Version 2; the new method is referred to as Signature Version 4.
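
As a rough illustration of why the region matters under Signature Version 4, here is a minimal sketch of the signing-key derivation, which chains HMAC-SHA256 over the date, region, and service. The function name and sample credentials are placeholders, and this is not code taken from s3lib.php.

```php
<?php
// Sketch of Signature Version 4 signing-key derivation (chained HMAC-SHA256).
// deriveSigV4SigningKey() and the sample credentials are placeholders.
function deriveSigV4SigningKey(string $secretKey, string $date, string $region, string $service): string
{
    // Each step uses the previous HMAC output as the key for the next one.
    $kDate    = hash_hmac('sha256', $date, 'AWS4' . $secretKey, true);
    $kRegion  = hash_hmac('sha256', $region, $kDate, true);
    $kService = hash_hmac('sha256', $service, $kRegion, true);
    return hash_hmac('sha256', 'aws4_request', $kService, true);
}

$keyEast = deriveSigV4SigningKey('EXAMPLE-SECRET', '20190723', 'us-east-1', 's3');
$keyWest = deriveSigV4SigningKey('EXAMPLE-SECRET', '20190723', 'us-west-2', 's3');

// The region is part of the derivation, so the two keys differ.
var_dump($keyEast === $keyWest);
```

Because the region feeds into the key, a request signed for the wrong region fails, which is why the plugin needs a region setting at all.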

The original s3lib.php file includes a link in the header comments that is no longer functional.  In researching an updated version of the file, I've found two sources on GitHub:

https://github.com/racklin/amazon-s3-php-class

and

https://github.com/tpyo/amazon-s3-php-class

According to an article I found on Medium, the fork maintained by Rack Lin is preferable:

https://medium.com/@martindrapeau/simple-php-code-to-push-files-to-aws-s3-3396f9b3d02a

So I'm using the Rack Lin fork as the starting point for the updates.  After copying down the library file ('S3.php'), I found a few extra spaces; deleting them keeps a diff against Anahita's 's3lib.php' file focused strictly on what has changed.

What's new - variables:

  • $region
  • $progressFunction
  • $signVer

What's new - functions:

  • setRegion()
  • getRegion()
  • setSignatureVersion()
  • setProgressFunction()
  • __getSignatureV4()

What changed - functions:

  • __construct() now includes $region
  • putBucket() now calls getRegion()
  • inputFile() now includes a sha256sum (SHA-256 hash) value
  • putObject() now includes a sha256sum (SHA-256 hash) value
  • __getCloudFrontDistributionConfigXML() now checks that $trustedSigners is not empty
  • getResponse() now checks $signVer for 'v2' versus 'v4'
  • getResponse() now includes the $progressFunction

Anahita's PlgStorageS3 class:

A little intro on Anahita's PlgStorageS3 class, then the questions to follow...

So Anahita's PlgStorageS3 class appears to need revisions in order to account for the signature version and region.  It seems these should be parameters that can be assigned in the Plugin's settings through additions to the s3.json file, with the signature version as a drop-down selector, and the region a text field with the default value of 'us-east-1'.

The "core" functions in PlgStorageS3 (_read, _write, _exists, and _delete) appear to be compatible with the Rack Lin replacement for the s3lib.php file.

Protected variables would need to be added for $_region and $_signature, and it appears the __construct function would need to be amended to account for both.

Scott Crawford
July 23 2019
So, questions:

(1) The _initialize function includes bucket and ssl. The function's header comments indicate this is used to instantiate the object from __construct. Does this indicate the _initialize function should also be amended to include 'region' and 'signVer' within the array?

(2) What specifically does the _getURL() function do? I can see the URL construction needs to be revised, but is it used only for loading stored objects when building the front-end?
I wanted to let you know how much I appreciate this!
Scott Crawford
3 weeks ago
May need a little help on revising the S3 plugin. So far I've amended the .json settings to also include signature version and region, which are both confirmed as being stored in the database. I've also replaced the s3lib.php file with the "Rack Lin" version mentioned above. Last, I've amended the s3.php file's __construct, _initialize, and _getUrl functions to account for the new connection variables.

So far, though, I haven't been able to make the connection work. My main focus at this point is the s3.php file's __construct function, where I'm trying to pass parameters in line with the s3lib.php file's __construct function:

https://github.com/wscrawford/anahita/blob/494f851e31f887d77af7ab079dd33a0a32a20816/src/plugins/storage/s3.php#L67

https://github.com/wscrawford/anahita/blob/494f851e31f887d77af7ab079dd33a0a32a20816/src/plugins/storage/s3lib.php#L229

Revising the s3.php file's _initialize function seems like it should be pretty straightforward and work as it currently stands, and the _getUrl function I have confirmed is rendering the paths correctly when accounting for V2 or V4 signature as well as SSL.
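
For reference, the kind of path construction involved might be sketched as below. This is a hypothetical helper, not the plugin's actual _getUrl function; the function name, parameters, and the rule that only 'us-east-1' keeps the global hostname are assumptions for illustration.

```php
<?php
// Hypothetical helper, not the plugin's actual _getUrl(); names are illustrative.
function buildObjectUrl(string $bucket, string $path, bool $useSSL, string $region = 'us-east-1'): string
{
    $scheme = $useSSL ? 'https' : 'http';
    // Assumption: us-east-1 keeps the global hostname, while other regions
    // use the regional virtual-hosted-style hostname.
    $host = ($region === '' || $region === 'us-east-1')
        ? $bucket . '.s3.amazonaws.com'
        : $bucket . '.s3.' . $region . '.amazonaws.com';
    // Normalize the path so it always starts with exactly one slash.
    return $scheme . '://' . $host . '/' . ltrim($path, '/');
}

echo buildObjectUrl('my-bucket', '/photos/1.jpg', true), PHP_EOL;
echo buildObjectUrl('my-bucket', 'photos/1.jpg', false, 'eu-west-1'), PHP_EOL;
```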

Am I picking up the parameters correctly in the revisions to the s3.php __construct function?
I think the code you have written in __construct needs to move to the _initialize method. __construct only receives the custom configurations, if there are any. In _initialize you check whether any custom values were passed in the $config object; otherwise you use the ones provided by the plugin parameters or the hardcoded default values.
Scott Crawford
3 weeks ago
So this would need to be different than in the original / current s3.php?

https://github.com/anahitasocial/anahita/blob/297d08977a9269ff9a29a879aadc95f2f4865e5c/src/plugins/storage/s3.php#L45

Here, isn't __construct defining a new S3 object based on the configurations it first picks up (_bucket and _ssl)? It seemed like we just needed to add the new parameters (signature version and region).

I can see where the endpoint could be defined in _initialize, but I'm not seeing why the other parameters wouldn't need to be handled as in the current plugin.
Scott Crawford
3 weeks ago
It did appear to me that signature version is being recognized in the _getUrl where I'm testing for $this->_signature in constructing the paths.
Ok, I was wrong. I checked the order of execution. _initialize gets called first, so you need to set the $config default values in the _initialize method. Then the constructor is called; in __construct, if a $config object is passed to the function, use it. It may be even easier to merge the two, so that the fields that contain a value overwrite the default values. Otherwise, use the default values that you set in the _initialize function.
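
A minimal sketch of that merge idea, using plain PHP arrays rather than Anahita's $config object. The function name, the key names, and the default values shown are assumptions for illustration only.

```php
<?php
// Hypothetical sketch of the merge described above; plain arrays stand in
// for Anahita's config object, and the key names are assumptions.
function mergeStorageConfig(array $defaults, array $custom): array
{
    // Drop custom entries that carry no value so they cannot blank out a default.
    $provided = array_filter($custom, function ($value) {
        return $value !== null && $value !== '';
    });
    // Later arrays win in array_merge, so provided values overwrite defaults.
    return array_merge($defaults, $provided);
}

$defaults = ['bucket' => '', 'ssl' => true, 'signature' => 'v2', 'region' => 'us-east-1'];
$custom   = ['bucket' => 'my-bucket', 'region' => '', 'ssl' => false];

print_r(mergeStorageConfig($defaults, $custom));
```

Here the empty custom region falls back to the default, while the explicit false for ssl is kept because only null and empty strings are treated as "no value".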
Scott Crawford
3 weeks ago
Where the new S3 object is created, it's required to pass the parameters in the correct order as the library's S3 has them defined, right? But, their function defines $endpoint so would I still need to pass the endpoint value to the new _s3 object? Or would I just skip the endpoint in passing values to the library's S3 object?

https://github.com/wscrawford/anahita/blob/494f851e31f887d77af7ab079dd33a0a32a20816/src/plugins/storage/s3lib.php#L229

Not sure if that makes sense... attempting to rephrase, the library's S3 object takes in 5 values (accessKey, secretKey, useSSL, endpoint, & region) but the 4th is defined by the function... so if I pass just 4 values (accessKey, secretKey, useSSL, & region) would the region value be misaligned with what the function expects? Or would I still need to pass endpoint even though it's defined in the library's function call?

Obviously I'm rusty on this, likely as well in my ability to describe :)
Create the S3 object in the plugin's __construct method, because at that point all the parameters have already been initialized in the _initialize method. When you pass parameters to the s3lib object, they need to be in the same order as in the s3lib's __construct method.

Yes, the other methods in this plugin pass the required values to the S3 object or return a value received from the s3 object. This plugin is an adaptor to make the S3 library work with Anahita.

Did I answer your questions?
Scott Crawford
3 weeks ago
Mostly - thanks - but I'm still trying to understand if I need to pass the endpoint.

The library's __construct is taking in 5 values: (1) accessKey, (2) secretKey, (3) useSSL, (4) endpoint, and (5) region:

https://github.com/wscrawford/anahita/blob/494f851e31f887d77af7ab079dd33a0a32a20816/src/plugins/storage/s3lib.php#L229

But, within the library's __construct function the endpoint value is set equal to 's3.amazonaws.com'.

In my revised plugin __construct, I'm currently passing the same 5 values in order. But since the library's __construct defines the endpoint's value, should this be omitted from the plugin's __construct?

https://github.com/wscrawford/anahita/blob/9b6cdb62d4cd12c5227d422209f5cb19b2cad7ea/src/plugins/storage/s3.php#L76

Or by omitting _endpoint from the plugin's __construct, does that throw off the ordering of the values being received by the library's __construct?

I.e.:

Expected:
1. accessKey
2. secretKey
3. useSSL
4. endpoint
5. region

Received:
1. accessKey
2. secretKey
3. useSSL
4. region -- is this being "received" as "endpoint" ?
Yes, the order of the passed parameters is important. If you are passing all of them, just pass 's3.amazonaws.com' for the endpoint. That should happen in the _initialize method, where you are adding default values.
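
To make the ordering point concrete, here is a small sketch using a stand-in class. FakeS3 is hypothetical and only mirrors the five-parameter order discussed in this thread; it is not the real s3lib.php class.

```php
<?php
// Stand-in class that only mirrors the five-parameter order discussed in the
// thread; it is not the real s3lib.php class.
class FakeS3
{
    public $endpoint;
    public $region;

    public function __construct($accessKey = null, $secretKey = null, $useSSL = false, $endpoint = 's3.amazonaws.com', $region = '')
    {
        $this->endpoint = $endpoint;
        $this->region = $region;
    }
}

// Four arguments: the region lands in the fourth slot and is read as the endpoint.
$shifted = new FakeS3('KEY', 'SECRET', true, 'us-west-2');
// Five arguments: the endpoint is passed explicitly, so the region stays fifth.
$aligned = new FakeS3('KEY', 'SECRET', true, 's3.amazonaws.com', 'us-west-2');

var_dump($shifted->endpoint, $shifted->region);
var_dump($aligned->endpoint, $aligned->region);
```

PHP fills positional arguments left to right, so a default value in the signature only applies when that slot and everything after it are left out.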