PERF 4: How do you select your database solution?
The optimal database solution for a system varies based on requirements for availability, consistency, partition tolerance, latency, durability, scalability, and query capability. Many systems use different database solutions for various subsystems and enable different features to improve performance. Selecting the wrong database solution and features for a system can lead to lower performance efficiency.
Resources
AWS purpose-built databases (DAT209-L)
Amazon Aurora storage demystified: How it all works (DAT309-R)
Amazon DynamoDB deep dive: Advanced design patterns (DAT403-R1)
Cloud Databases with AWS
AWS Database Caching
Amazon DynamoDB Accelerator
Amazon Aurora best practices
Amazon Redshift performance
Amazon Athena top 10 performance tips
Amazon Redshift Spectrum best practices
Amazon DynamoDB best practices
Best Practices:
- Understand data characteristics: Understand the different characteristics of data in your workload. Determine if the workload requires transactions, how it interacts with data, and what its performance demands are. Use this information to select the best-performing database approach for your workload (for example, relational, NoSQL key-value, document, wide-column, graph, time-series, or in-memory storage).
- Evaluate the available options: Evaluate the services and storage options that are available as part of the selection process for your workload's storage mechanisms. Understand how, and when, to use a given service or system for data storage. Learn about available configuration options that can optimize database performance or efficiency, such as provisioned IOPS, memory and compute resources, and caching.
- Collect and record database performance metrics: Use tools, libraries, and systems that record performance measurements related to database performance. For example, measure transactions per second, slow queries, or system latency introduced when accessing the database. Use this data to understand the performance of your database systems.
- Choose data storage based on access patterns: Use the access patterns of the workload to decide which services and technologies to use. For example, use a relational database for workloads that require transactions, or, where eventual consistency is acceptable, a key-value store that provides higher throughput.
- Optimize data storage based on access patterns and metrics: Use performance characteristics and access patterns to optimize how data is stored or queried so that you achieve the best possible performance. Measure how optimizations such as indexing, key distribution, data warehouse design, or caching strategies impact system performance or overall efficiency.
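The practices on recording metrics and measuring the impact of an optimization can be illustrated with a small, self-contained sketch. It is not an AWS-specific implementation: it assumes an in-memory SQLite database as a stand-in for a real database, and the `orders` table, the `SLOW_QUERY_THRESHOLD_S` value, and the helper names (`timed_query`, `build_db`, `measure`, `uses_index`) are hypothetical, chosen only for illustration. It records per-query latency, counts slow queries, and compares the query plan and latency before and after adding an index that matches the access pattern.

```python
import sqlite3
import statistics
import time

# Hypothetical threshold; a real workload would derive this from its latency targets.
SLOW_QUERY_THRESHOLD_S = 0.005

QUERY = "SELECT COUNT(*), SUM(total) FROM orders WHERE customer_id = ?"


def timed_query(conn, sql, params=(), log=None):
    """Run a query, return (rows, elapsed seconds), and optionally record the latency."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    if log is not None:
        log.append((sql, elapsed))
    return rows, elapsed


def build_db(n_rows=50_000):
    """Create an in-memory table with n_rows orders spread over 1,000 customers."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        ((i % 1000, float(i)) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def measure(conn, runs=20):
    """Run the lookup for several customers and summarize the recorded latencies."""
    log = []
    for customer_id in range(runs):
        timed_query(conn, QUERY, (customer_id,), log)
    latencies = [elapsed for _, elapsed in log]
    return {
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
        "slow_queries": sum(t > SLOW_QUERY_THRESHOLD_S for t in latencies),
    }


def uses_index(conn, sql, params=()):
    """Return True if SQLite's query plan for this statement mentions an index."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each plan row is (id, parent, notused, detail); the detail text describes the step.
    return any("INDEX" in row[3].upper() for row in plan)


if __name__ == "__main__":
    conn = build_db()
    print("indexed before:", uses_index(conn, QUERY, (0,)), measure(conn))

    # Optimization under test: an index that matches the lookup-by-customer access pattern.
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    print("indexed after: ", uses_index(conn, QUERY, (0,)), measure(conn))
```

Absolute timings depend on the machine, so the useful signal is the before-and-after comparison together with the change in query plan. In a production workload the same measurements would typically come from the database's own monitoring and slow-query logging rather than from application-side timers.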
Improvement Plan
Understand data characteristics
Evaluate the available options
Collect and record database performance metrics
Choose data storage based on access patterns
Optimize data storage based on access patterns and metrics