An S3 bucket will be receiving hundreds of CSV files per minute … I want to insert those CSV rows into a SQL database [most likely RDS] in real time…
I know I can use S3 events and Lambda for that, but I am concerned about Lambda concurrency limits.
The project could grow globally, with thousands of files coming in per minute … What is the best design for this, and which AWS services should I use?
The data inserted into SQL will be used for balance-like calculations.
I am looking for reliability and unlimited scalability.
Thank you
What is the acceptable latency between when the CSV file lands in the bucket and when the record is added to RDS? Could you trigger the Lambda every minute and process the newly added files?
Another concern to think about with Lambda is cost. With thousands of invocations per minute, it might be more cost-effective to use an EC2 instance.
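A minimal sketch of the scheduled approach, assuming an EventBridge rule invokes the function every minute; the bucket name and the one-minute lookback window are placeholders, not values from the question:

```python
from datetime import datetime, timedelta, timezone

def select_new_objects(objects, since):
    """Keep only S3 object summaries modified after `since`.
    `objects` is a list of dicts shaped like list_objects_v2 'Contents' entries."""
    return [o for o in objects if o["LastModified"] > since]

def handler(event, context):
    # boto3 ships with the Lambda runtime; imported inside the handler so the
    # pure filtering logic above stays testable without AWS credentials.
    import boto3
    s3 = boto3.client("s3")
    cutoff = datetime.now(timezone.utc) - timedelta(minutes=1)
    new_keys = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket="my-csv-bucket"):  # placeholder bucket
        new_keys += [o["Key"] for o in
                     select_new_objects(page.get("Contents", []), cutoff)]
    # ...download each key and insert its rows into RDS...
    return new_keys
```

One caveat with listing on a timer: clock skew or a delayed invocation can make a file fall between two windows, so an S3 event notification (optionally buffered through SQS) is usually the more reliable trigger than `LastModified` polling.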
Hi, the CSV files can be large (thousands of lines each), but there will only be one new file per minute … I believe Lambda can handle that. I will use a Lambda function to insert the rows into an RDS Aurora database … I have read in the meantime about Timestream, and I don't think the data fits that model … So after all, it will actually be at most one invocation per minute …
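For the Lambda-into-Aurora step, a sketch of batching the CSV rows into `executemany()`-sized chunks; the table name `transactions`, the `pymysql` driver, and the connection details are all assumptions for illustration, and the code assumes the CSV header names are trusted column names:

```python
import csv
import io

def batch_rows(csv_text, batch_size=500):
    """Parse CSV text (header row first) and yield (header, rows) pairs
    sized for batched inserts, so a thousands-of-lines file is not
    inserted one statement per row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    batch = []
    for row in reader:
        batch.append(tuple(row))
        if len(batch) == batch_size:
            yield header, batch
            batch = []
    if batch:
        yield header, batch

def handler(event, context):
    # Triggered by an S3 event notification; boto3/pymysql imported here
    # so the parsing logic above stays testable locally.
    import boto3, pymysql
    rec = event["Records"][0]["s3"]
    body = boto3.client("s3").get_object(
        Bucket=rec["bucket"]["name"], Key=rec["object"]["key"]
    )["Body"].read().decode("utf-8")
    conn = pymysql.connect(  # placeholder credentials
        host="my-aurora-endpoint", user="app", password="...", db="ledger"
    )
    try:
        with conn.cursor() as cur:
            for header, batch in batch_rows(body):
                cols = ", ".join(header)  # assumes trusted, validated headers
                placeholders = ", ".join(["%s"] * len(header))
                cur.executemany(
                    f"INSERT INTO transactions ({cols}) VALUES ({placeholders})",
                    batch,
                )
        conn.commit()
    finally:
        conn.close()
```

At one large file per minute, one invocation doing batched inserts inside a single transaction keeps both Lambda concurrency and the Aurora connection count low.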