Automate AWS Aurora cluster start and stop triggered by time
January 30, 2019
Background
After my blog's database was migrated to Aurora, its performance improved considerably. Now I am focusing on reducing the Aurora database cost.
Hello, everybody. I am Leo Du, a builder on AWS and Google Cloud. Today I will share how I reduced my Aurora database cost by about 30 percent (compared to a no-upfront RI) and by about 22 percent (compared to Aurora Serverless). Each comparison uses the same instance type, size, and region.
Objective
The overall objective is to reduce the Aurora cost. Because the database serves my personal blog, I can stop the Aurora cluster at night to avoid wasting money.
Before I move on to the detailed design, let me list the objectives of the design and the deliverable.
The architecture should be secure, cost-efficient, and easy to maintain.
Functionally, the design should allow the start and stop times of the Aurora cluster to be adjusted. This includes, but is not limited to, starting and stopping the Aurora cluster, controlling timing with minute-level precision, specifying time points flexibly for different conditions, and running all actions fully automatically.
Because anything could change in the future, committing to cloud resources in advance is not an option here.
Last but not least, the implementation should not introduce much additional cost.
*
Constraint
A long time ago, when the blog application was hosted on RDS MySQL, I used AWS Instance Scheduler (version 2.2.2.0) to control the start and stop of EC2 and RDS MySQL instances. That version, however, does not support actions against Aurora clusters, so I had to find a replacement solution to start and stop the Aurora cluster.
Solution
To meet the aforementioned requirements, the design combines Systems Manager Automation, CloudWatch Events, and IAM. The overall architecture is illustrated below.
Detailed Solution
To meet the requirements, we use Systems Manager Automation, CloudWatch Events, and IAM.
To meet the security requirements, the design uses IAM features including, but not limited to, assuming roles and passing roles to grant access between service components. As always, I use policies to define permissions and narrow the access scope down to least privilege.
To simplify maintenance, the architecture should minimize maintenance tasks related to the underlying infrastructure. Both Systems Manager Automation and CloudWatch Events are serverless, so there is no underlying infrastructure to maintain.
To operate the Aurora cluster, and in particular to start and stop it, the deliverable must be able to call the RDS APIs StartDBCluster and StopDBCluster. Systems Manager Automation supports document definitions that can call these two APIs.
The deliverable should also have minute-level time precision and be flexible enough to support different conditions. CloudWatch Events supports cron expressions, which are flexible enough and can define time points at minute granularity.
To make the deliverable fully automated, CloudWatch Events triggers the actions that start or stop the Aurora cluster. The start and stop procedures are defined in Systems Manager Automation, which is also responsible for executing them.
Implementation
Systems Manager - Automation
Create a new Automation document in Systems Manager. It defines how to start an Aurora cluster. Name the document "my-AWS-StartRdsCluster". Below is the content of this document.

---
description: Start RDS Cluster
schemaVersion: "0.3"
assumeRole: "{{ AutomationAssumeRole }}"
parameters:
  ClusterId:
    type: String
    description: (Required) RDS Cluster Id to start
  AutomationAssumeRole:
    type: String
    description: (Optional) The ARN of the role that allows Automation to perform the actions on your behalf.
    default: ""
mainSteps:
  - name: AssertNotStartingOrAvailable
    action: aws:assertAwsResourceProperty
    isCritical: false
    onFailure: step:StartCluster
    nextStep: CheckStart
    inputs:
      Service: rds
      Api: DescribeDBClusters
      DBClusterIdentifier: "{{ClusterId}}"
      PropertySelector: "$.DBClusters[0].Status"
      DesiredValues: ["available", "starting"]
  - name: StartCluster
    action: aws:executeAwsApi
    inputs:
      Service: rds
      Api: StartDBCluster
      DBClusterIdentifier: "{{ClusterId}}"
  - name: CheckStart
    action: aws:waitForAwsResourceProperty
    onFailure: Abort
    maxAttempts: 10
    timeoutSeconds: 600
    inputs:
      Service: rds
      Api: DescribeDBClusters
      DBClusterIdentifier: "{{ClusterId}}"
      PropertySelector: "$.DBClusters[0].Status"
      DesiredValues: ["available"]
    isEnd: true
...

Set this document version as default.
Create another Automation document to define how to stop an Aurora cluster. Name the document "my-AWS-StopRdsCluster". Below is the content of this document.

---
description: Stop RDS Cluster
schemaVersion: "0.3"
assumeRole: "{{ AutomationAssumeRole }}"
parameters:
  ClusterId:
    type: String
    description: (Required) RDS Cluster Id to stop
  AutomationAssumeRole:
    type: String
    description: (Optional) The ARN of the role that allows Automation to perform the actions on your behalf.
    default: ""
mainSteps:
  - name: AssertNotStopped
    action: aws:assertAwsResourceProperty
    isCritical: false
    onFailure: step:StopCluster
    nextStep: CheckStop
    inputs:
      Service: rds
      Api: DescribeDBClusters
      DBClusterIdentifier: "{{ClusterId}}"
      PropertySelector: "$.DBClusters[0].Status"
      DesiredValues: ["stopped", "stopping"]
  - name: StopCluster
    action: aws:executeAwsApi
    inputs:
      Service: rds
      Api: StopDBCluster
      DBClusterIdentifier: "{{ClusterId}}"
  - name: CheckStop
    action: aws:waitForAwsResourceProperty
    onFailure: Abort
    maxAttempts: 10
    timeoutSeconds: 600
    inputs:
      Service: rds
      Api: DescribeDBClusters
      DBClusterIdentifier: "{{ClusterId}}"
      PropertySelector: "$.DBClusters[0].Status"
      DesiredValues: ["stopped"]
...

Set this document version as default.
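Both documents follow the same pattern: assert the current status, call the API only if needed, then wait for the target status. The Python sketch below mirrors that flow to make it easier to reason about. It is not part of the original setup: `ensure_cluster_state` and `FakeRds` are hypothetical names, the client is assumed to expose the boto3 RDS methods `describe_db_clusters`, `start_db_cluster`, and `stop_db_cluster`, and an in-memory stub stands in for AWS so the logic can be run locally.

```python
import time


def ensure_cluster_state(rds, cluster_id, target, max_attempts=10, delay_seconds=0):
    """Mirror the Automation steps: assert, act if needed, then wait.

    target is "available" (start) or "stopped" (stop).
    """
    transitional = {"available": "starting", "stopped": "stopping"}[target]

    def status():
        resp = rds.describe_db_clusters(DBClusterIdentifier=cluster_id)
        return resp["DBClusters"][0]["Status"]

    # Step 1 (aws:assertAwsResourceProperty): skip the API call if the
    # cluster is already in, or moving towards, the target state.
    if status() not in (target, transitional):
        # Step 2 (aws:executeAwsApi): issue the start or stop call.
        if target == "available":
            rds.start_db_cluster(DBClusterIdentifier=cluster_id)
        else:
            rds.stop_db_cluster(DBClusterIdentifier=cluster_id)

    # Step 3 (aws:waitForAwsResourceProperty): poll until the target state.
    # Simplified: the real step is also bounded by timeoutSeconds.
    for _ in range(max_attempts):
        if status() == target:
            return True
        time.sleep(delay_seconds)
    return False


class FakeRds:
    """Hypothetical in-memory stand-in for a boto3 RDS client."""

    def __init__(self, status):
        self.status = status
        self.calls = []

    def describe_db_clusters(self, DBClusterIdentifier):
        # Report the current status; a transitional state settles afterwards.
        current = self.status
        if current == "starting":
            self.status = "available"
        elif current == "stopping":
            self.status = "stopped"
        return {"DBClusters": [{"Status": current}]}

    def start_db_cluster(self, DBClusterIdentifier):
        self.calls.append("start")
        self.status = "starting"

    def stop_db_cluster(self, DBClusterIdentifier):
        self.calls.append("stop")
        self.status = "stopping"


stopped = FakeRds("stopped")
started_ok = ensure_cluster_state(stopped, "my-aurora-cluster", "available")

already_up = FakeRds("available")
noop_ok = ensure_cluster_state(already_up, "my-aurora-cluster", "available")
```

In the second call the cluster is already available, so no start request is issued, which matches the purpose of the first step in the document.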
*
IAM
CloudWatch Events needs to assume a role and then use the permissions granted to that role to execute SSM Automation with the Automation document and its parameters (the Aurora cluster ID and the role for Systems Manager Automation to assume).
Create a role for the CloudWatch Events rule that starts the Aurora cluster. The rule assumes this role to execute SSM Automation actions.
Name the role "role_CweInvokeSsmAutomation_StartAurora". Below is the policy of this role.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "ssm:StartAutomationExecution",
      "Effect": "Allow",
      "Resource": [
        "arn:aws:ssm:<AwsRegionId>:<AwsAccountId>:automation-definition/my-AWS-StartRdsCluster:$DEFAULT"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "iam:PassRole"
      ],
      "Resource": "arn:aws:iam::<AwsAccountId>:role/role_cwe_startStopRds",
      "Condition": {
        "StringLikeIfExists": {
          "iam:PassedToService": "ssm.amazonaws.com"
        }
      }
    }
  ]
}

Its trust policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "events.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

*
Create a role for the CloudWatch Events rule that stops the Aurora cluster.
Name the role "role_CweInvokeSsmAutomation_StopAurora". Below is the policy of this role.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "ssm:StartAutomationExecution",
      "Effect": "Allow",
      "Resource": [
        "arn:aws:ssm:<AwsRegionId>:<AwsAccountId>:automation-definition/my-AWS-StopRdsCluster:$DEFAULT"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "iam:PassRole"
      ],
      "Resource": "arn:aws:iam::<AwsAccountId>:role/role_cwe_startStopRds",
      "Condition": {
        "StringLikeIfExists": {
          "iam:PassedToService": "ssm.amazonaws.com"
        }
      }
    }
  ]
}

Its trust policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "events.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
Create a new IAM role that allows Systems Manager Automation to perform the actions on your behalf. Name the role "role_cwe_startStopRds". Create a new inline policy as shown below. Also attach the IAM policies "CloudWatchEventsBuiltInTargetExecutionAccess" and "CloudWatchEventsInvocationAccess" to this role.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "rds:StartDBCluster",
        "rds:StopDBCluster",
        "rds:StopDBInstance",
        "rds:StartDBInstance",
        "rds:DescribeDBClusters"
      ],
      "Resource": "*"
    }
  ]
}

Its trust policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ssm.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
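The two invocation roles differ only in the Automation document they are allowed to execute, which is what keeps each rule scoped to a single action. As a rough sketch of that idea, the Python helper below generates both policies from one template. The function name `invoke_policy` is hypothetical, and the region and account ID are placeholders rather than values from the original setup.

```python
import json


def invoke_policy(region, account_id, document_name, automation_role_name):
    """Build the permissions policy for a CloudWatch Events invocation role."""
    document_arn = (
        f"arn:aws:ssm:{region}:{account_id}:"
        f"automation-definition/{document_name}:$DEFAULT"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow executing exactly one Automation document.
                "Effect": "Allow",
                "Action": "ssm:StartAutomationExecution",
                "Resource": [document_arn],
            },
            {
                # Allow handing the Automation role to SSM, and only to SSM.
                "Effect": "Allow",
                "Action": ["iam:PassRole"],
                "Resource": f"arn:aws:iam::{account_id}:role/{automation_role_name}",
                "Condition": {
                    "StringLikeIfExists": {
                        "iam:PassedToService": "ssm.amazonaws.com"
                    }
                },
            },
        ],
    }


# Placeholder region and account ID; substitute your own values.
REGION = "ap-northeast-2"
ACCOUNT_ID = "123456789012"

start_policy = invoke_policy(
    REGION, ACCOUNT_ID, "my-AWS-StartRdsCluster", "role_cwe_startStopRds"
)
stop_policy = invoke_policy(
    REGION, ACCOUNT_ID, "my-AWS-StopRdsCluster", "role_cwe_startStopRds"
)

print(json.dumps(start_policy, indent=2))
```

Generating the policies this way makes it harder for the start role to end up pointing at the stop document by accident.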
CloudWatch Events
For the CloudWatch Events rule that starts the Aurora cluster, the table below lists the necessary configuration.
Setting | Value | Description
Name | ScheduleStartRdsCluster |
Type | Time-based event |
Cron expression | 0 23 * * ? * | every day at 7:00 (GMT+8)
Target | SSM Automation document ("my-AWS-StartRdsCluster") |
AutomationAssumeRole | ARN of role "role_cwe_startStopRds" |
Role | role_CweInvokeSsmAutomation_StartAurora |
Aurora cluster ID | The Aurora cluster ID in your environment |
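CloudWatch Events evaluates cron expressions in UTC, which is why 7:00 in GMT+8 appears as hour 23 in the table. A small helper can make the conversion explicit. The function name `utc_cron` is hypothetical, and the sketch only handles whole-hour offsets and daily schedules.

```python
def utc_cron(local_hour, local_minute, utc_offset_hours):
    """Return a daily CloudWatch Events cron expression for a local time.

    The local hour is shifted by the UTC offset and wrapped around midnight.
    """
    utc_hour = (local_hour - utc_offset_hours) % 24
    # Fields: minutes hours day-of-month month day-of-week year
    return f"cron({local_minute} {utc_hour} * * ? *)"


start_expression = utc_cron(7, 0, 8)   # 7:00 in GMT+8
stop_expression = utc_cron(22, 0, 8)   # 22:00 in GMT+8

print(start_expression)  # cron(0 23 * * ? *)
print(stop_expression)   # cron(0 14 * * ? *)
```

The `cron(...)` wrapper is the form used when a schedule expression is set through the API. The table shows only the six fields inside it.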
For the CloudWatch Events rule that stops the Aurora cluster, the table below lists the necessary configuration.
Setting | Value | Description
Name | ScheduleStopRdsCluster |
Type | Time-based event |
Cron expression | 0 14 * * ? * | every day at 22:00 (GMT+8)
Target | SSM Automation document ("my-AWS-StopRdsCluster") |
AutomationAssumeRole | ARN of role "role_cwe_startStopRds" |
Role | role_CweInvokeSsmAutomation_StopAurora |
Aurora cluster ID | The Aurora cluster ID in your environment |
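The settings in these tables could also be applied programmatically. The sketch below assumes a boto3-style CloudWatch Events client exposing `put_rule` and `put_targets`. The function `schedule_automation` and the recording stub are hypothetical, and the region, account ID, and cluster ID are placeholders. The stub lets the example run without AWS access. With real credentials, a client from `boto3.client("events")` would take its place.

```python
import json


def schedule_automation(events, rule_name, cron_fields, document_arn,
                        invoke_role_arn, automation_role_arn, cluster_id):
    """Create a scheduled rule and point it at an SSM Automation document."""
    events.put_rule(
        Name=rule_name,
        ScheduleExpression=f"cron({cron_fields})",
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{
            "Id": "1",
            "Arn": document_arn,
            # Role that CloudWatch Events assumes to start the execution.
            "RoleArn": invoke_role_arn,
            # Automation parameters are passed as lists of strings.
            "Input": json.dumps({
                "ClusterId": [cluster_id],
                "AutomationAssumeRole": [automation_role_arn],
            }),
        }],
    )


class RecordingEvents:
    """Hypothetical stub that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def put_rule(self, **kwargs):
        self.calls.append(("put_rule", kwargs))

    def put_targets(self, **kwargs):
        self.calls.append(("put_targets", kwargs))


# Placeholder region, account ID, and cluster ID; substitute your own.
REGION = "ap-northeast-2"
ACCOUNT = "123456789012"

events = RecordingEvents()
schedule_automation(
    events,
    rule_name="ScheduleStopRdsCluster",
    cron_fields="0 14 * * ? *",
    document_arn=(
        f"arn:aws:ssm:{REGION}:{ACCOUNT}:"
        "automation-definition/my-AWS-StopRdsCluster:$DEFAULT"
    ),
    invoke_role_arn=f"arn:aws:iam::{ACCOUNT}:role/role_CweInvokeSsmAutomation_StopAurora",
    automation_role_arn=f"arn:aws:iam::{ACCOUNT}:role/role_cwe_startStopRds",
    cluster_id="my-aurora-cluster",
)
```

The start rule would use the same call with the start document, the start invocation role, and the cron fields `0 23 * * ? *`.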
Based on the above parameters, the Aurora cluster is billed for 15 hours every day. The table below compares the annual Aurora MySQL cost for the instance class "db.t2.small" in the Seoul region. Because AWS prices may change over time, note that these prices were obtained on Jan. 31, 2019. The scheduled on-demand (OD) cluster costs about 30% less than a no-upfront RI.
Option | Annual cost
OD with 15 hours daily running time | $344.93
RI - No Upfront | $490.56
RI - Partial Upfront | $415.24
RI - All Upfront | $407.00
Because the smallest Aurora Serverless size is 2 vCPUs and 4 GB of memory, the comparison here uses the t2.medium instance size to keep it even. The scheduled on-demand cluster costs about 22% less than Aurora Serverless.
Option | Annual cost
OD with 15 hours daily running time | $684.38
Serverless | $876.00
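The percentages quoted above can be reproduced from the annual figures in the two tables. The short calculation below uses only those figures. The helper name `saving_percent` is introduced here for illustration.

```python
HOURS_PER_DAY_RUNNING = 15  # 7:00 to 22:00 (GMT+8)
DAYS_PER_YEAR = 365


def saving_percent(scheduled_cost, baseline_cost):
    """Relative saving of the scheduled cluster against a baseline, in percent."""
    return (baseline_cost - scheduled_cost) / baseline_cost * 100


running_hours = HOURS_PER_DAY_RUNNING * DAYS_PER_YEAR

# db.t2.small: scheduled on-demand vs. no-upfront RI
vs_ri = saving_percent(344.93, 490.56)

# db.t2.medium: scheduled on-demand vs. Aurora Serverless
vs_serverless = saving_percent(684.38, 876.00)

print(f"{running_hours} running hours per year")
print(f"{vs_ri:.1f}% saved vs. no-upfront RI")
print(f"{vs_serverless:.1f}% saved vs. Serverless")
```

This gives 5475 running hours per year, with savings of roughly 29.7% and 21.9%, which round to the 30% and 22% stated above.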
With these configurations in place, the Aurora cluster should start and stop automatically according to the cron settings.
Nota bene
When CloudWatch Events triggers a Systems Manager Automation execution, you may observe a delay between the time set in the cron expression and the status update in the Aurora console. If you check the Aurora events and compare them with the cron expression, you will find that a "db.t2.small" Aurora cluster becomes available around 8 minutes after the time point specified in the cron expression. For example:
GMT 04:02 scheduled to start in CloudWatch Events
GMT 04:10 RDS cluster is totally available