Some AWS client calls return only a limited amount of data per response (typically 1,000 items).
An example response may look as follows:
aws_client.list_objects_v2(bucket: bucket)
=> #<struct Aws::S3::Types::ListObjectsV2Output
is_truncated=true,
contents=
[#<struct Aws::S3::Types::Object
key="reports/report_2.csv",
last_modified=2019-03-13 14:25:04 UTC,
etag="\"5a7c05eb47dcd13a27a26d34eb13b0ec\"",
size=466,
storage_class="STANDARD",
owner=nil>,
...
]
name="awesome-bucket",
prefix="",
delimiter=nil,
max_keys=1000,
common_prefixes=[],
encoding_type=nil,
key_count=1000,
continuation_token=nil,
next_continuation_token="1wEBwtqJOGmZF5DXgu5UhTMv386wdtND0EQzkkOUEGPPeF8tC58BEbfBvfsVHKGnxNgHxvFARrcWdCPJXXgiMzUtpedrxZP2G9wu/0but8ALLHDGdZVD4OHb41DWQKocGGAOwr0wfOeN4hUoCzimKeA==",
start_after=nil>
Because the list_objects_v2 method takes continuation_token as an argument, one way to fetch all the records is to loop through the responses, passing each response's next_continuation_token to the next call, until the next_continuation_token field is empty.
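The manual loop described above could be sketched roughly as follows. The helper name fetch_all_objects is hypothetical, and client is assumed to behave like an Aws::S3::Client (with the real SDK you would first require 'aws-sdk-s3' and build the client yourself):

```ruby
# Hypothetical helper: collects objects from every page by following
# next_continuation_token by hand. `client` is assumed to respond to
# list_objects_v2 the way Aws::S3::Client does.
def fetch_all_objects(client, bucket)
  objects = []
  token = nil

  loop do
    params = { bucket: bucket }
    # Only send the token once we have one from a previous page.
    params[:continuation_token] = token if token

    page = client.list_objects_v2(params)
    objects.concat(page.contents)

    token = page.next_continuation_token
    # An empty or missing token means there are no more pages.
    break if token.nil? || token.empty?
  end

  objects
end
```

This works, but it is a fair amount of bookkeeping for something the SDK can already do on its own.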
Instead, you can use the built-in enumerator of the response object, which yields the results from every page (the SDK fetches the subsequent pages automatically):
aws_client.list_objects_v2(bucket: bucket).map { |page| page[:contents] }
=> [[#<struct Aws::S3::Types::Object
key="reports/report_2.csv",
last_modified=2019-03-13 14:25:04 UTC,
etag="\"5a7c05eb47dcd13a27a26d34eb13b0ec\"",
size=466,
storage_class="STANDARD",
owner=nil>,
#<struct Aws::S3::Types::Object
key="reports/report_1.csv",
last_modified=2019-03-13 13:43:30 UTC,
etag="\"dc7215c066f62c7ddedef78e123dbc7c\"",
size=191722,
storage_class="STANDARD",
owner=nil>,
... ]
However, there is an even simpler way to achieve the same result. You can use the pluck method (added to Enumerable by ActiveSupport, so it is available in Rails applications) as follows:
aws_client.list_objects_v2(bucket: bucket).pluck(:contents)
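Note that both map and pluck return one array per page, so the result is an array of arrays. If a single flat list is more convenient, a minimal sketch using flat_map might look like this. The helper name all_object_keys is hypothetical, and pages is assumed to be any Enumerable that yields pages responding to contents, such as the response of list_objects_v2:

```ruby
# Hypothetical helper: flattens the per-page contents into one list
# and returns just the object keys.
def all_object_keys(pages)
  pages
    .flat_map { |page| page.contents } # one flat array instead of one array per page
    .map { |object| object.key }
end

# Assumed usage with the client from the examples above:
#   all_object_keys(aws_client.list_objects_v2(bucket: bucket))
```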