DynamoDB is a serverless NoSQL database service offered by AWS.
Data in a DynamoDB table is stored as key-value pairs, rather than in the rows and columns of a SQL database.
A DynamoDB table has one partition key and an optional sort key. Together they form the table's primary key: if the table has only a partition key, that key alone uniquely identifies a record; if it also has a sort key, several records can share a partition key value as long as their sort key values differ.
The sort key is also called a range key. Items with the same partition key are stored physically close together, in sorted order by sort key value.
A DynamoDB record is therefore uniquely identified by its partition key and, when one is defined, its sort key.
The catch is that a query operation works only on the primary key attributes: you must supply the partition key value and can optionally add a condition on the sort key. A query operation cannot select on a non-key attribute. To solve this, AWS provides two kinds of secondary indexes for specific scenarios.
These are –
- Local Secondary Indexes (LSIs)
- Global Secondary Indexes (GSIs)
For example, suppose you want to find all the albums by an artist in a genre, in a table where the genre name is the partition key and the album name is the sort key.
Genre (PK) | Album (SK) | Artist | Year |
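As a rough sketch, the example table could be described with the request parameters below. The code only builds plain dictionaries in the shape that boto3's `create_table` and `query` calls accept; the table name `Music`, the capacity values, and the helper function names are assumptions made for illustration.

```python
# Hypothetical table name; attribute names follow the example table above.
TABLE_NAME = "Music"


def create_table_params():
    """Parameters for a table keyed on Genre (partition) and Album (sort)."""
    return {
        "TableName": TABLE_NAME,
        # Only key attributes are declared; other attributes need no schema.
        "AttributeDefinitions": [
            {"AttributeName": "Genre", "AttributeType": "S"},
            {"AttributeName": "Album", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "Genre", "KeyType": "HASH"},   # partition key
            {"AttributeName": "Album", "KeyType": "RANGE"},  # sort key
        ],
        # Illustrative capacity values, not a recommendation.
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }


def albums_in_genre_params(genre):
    """Query parameters that fetch every album in one genre."""
    return {
        "TableName": TABLE_NAME,
        # Placeholders (#genre, :genre) avoid clashes with reserved words.
        "KeyConditionExpression": "#genre = :genre",
        "ExpressionAttributeNames": {"#genre": "Genre"},
        "ExpressionAttributeValues": {":genre": {"S": genre}},
    }
```

With boto3 installed and AWS credentials configured, these dictionaries could be passed as `boto3.client("dynamodb").create_table(**create_table_params())` and `boto3.client("dynamodb").query(**albums_in_genre_params("Rock"))`.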
One approach is to add an additional sort key on Artist, so that you can also query over a second attribute.
A Local Secondary Index (LSI) does exactly this: it keeps the table's partition key (Genre) but uses the artist name as an alternate sort key, so that you can query all the albums by an artist in a genre. You also choose a projection, which determines the attributes copied into the index; projecting all attributes lets a query on the LSI return complete items.
Remember, you can only create an LSI when creating the table itself. Also, an LSI shares the provisioned capacity (RCUs and WCUs) of the base table, so heavy query traffic on the LSI can end up throttling queries on the table itself.
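A minimal sketch of how an LSI on the artist name might be declared and queried follows. It builds request dictionaries in the shape boto3's `create_table` and `query` accept, without calling AWS. The table name `Music`, the index name `ArtistIndex`, the capacity values, and the function names are hypothetical.

```python
# Hypothetical names chosen for illustration.
TABLE_NAME = "Music"
LSI_NAME = "ArtistIndex"


def create_table_with_lsi_params():
    """create_table parameters that define the LSI together with the table."""
    return {
        "TableName": TABLE_NAME,
        # Artist must be declared because the index uses it as a key.
        "AttributeDefinitions": [
            {"AttributeName": "Genre", "AttributeType": "S"},
            {"AttributeName": "Album", "AttributeType": "S"},
            {"AttributeName": "Artist", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "Genre", "KeyType": "HASH"},
            {"AttributeName": "Album", "KeyType": "RANGE"},
        ],
        "LocalSecondaryIndexes": [
            {
                "IndexName": LSI_NAME,
                # Same partition key as the table, different sort key.
                "KeySchema": [
                    {"AttributeName": "Genre", "KeyType": "HASH"},
                    {"AttributeName": "Artist", "KeyType": "RANGE"},
                ],
                # Copy every attribute into the index.
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        # The LSI has no throughput of its own; it shares the table's.
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }


def albums_by_artist_in_genre_params(genre, artist):
    """Query parameters that use the LSI to match genre and artist."""
    return {
        "TableName": TABLE_NAME,
        "IndexName": LSI_NAME,
        "KeyConditionExpression": "#genre = :genre AND #artist = :artist",
        "ExpressionAttributeNames": {"#genre": "Genre", "#artist": "Artist"},
        "ExpressionAttributeValues": {
            ":genre": {"S": genre},
            ":artist": {"S": artist},
        },
    }
```

Note that the LSI entry carries no `ProvisionedThroughput` of its own, which reflects the shared-capacity behaviour described above.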
But suppose you want to query all the albums by an artist, across genres. In this case, you have to query over the artist name, which is not the partition key currently configured.
In DynamoDB you cannot query over a non-key attribute, so you will have to scan through all the records of the table and use a filter expression to filter by artist.
This is inefficient and not cost-effective, since the scan reads every record of the table and the filter is applied only afterwards.
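The difference can be illustrated with a small in-memory simulation. This is not DynamoDB code: the sample records are invented, and the two functions only mimic how many items a scan and a key-based query would have to read.

```python
from collections import defaultdict

# Invented sample records, for illustration only.
ITEMS = [
    {"Genre": "Rock", "Album": "Album A", "Artist": "Artist X", "Year": 1991},
    {"Genre": "Rock", "Album": "Album B", "Artist": "Artist Y", "Year": 1994},
    {"Genre": "Jazz", "Album": "Album C", "Artist": "Artist X", "Year": 1999},
    {"Genre": "Jazz", "Album": "Album D", "Artist": "Artist Z", "Year": 2003},
    {"Genre": "Pop", "Album": "Album E", "Artist": "Artist Y", "Year": 2010},
]


def scan_with_filter(items, artist):
    """Mimic a scan: read every item, then drop the ones that don't match."""
    examined = 0
    matches = []
    for item in items:
        examined += 1  # each item read counts, whether or not it matches
        if item["Artist"] == artist:
            matches.append(item)
    return matches, examined


def query_by_partition_key(items, genre):
    """Mimic a query: go straight to one partition and read only its items."""
    # The grouping below stands in for DynamoDB's partitioned storage layout.
    partitions = defaultdict(list)
    for item in items:
        partitions[item["Genre"]].append(item)
    matches = partitions.get(genre, [])
    return matches, len(matches)


if __name__ == "__main__":
    found, examined = scan_with_filter(ITEMS, "Artist X")
    print(f"scan: {len(found)} matches, {examined} items examined")
    found, examined = query_by_partition_key(ITEMS, "Rock")
    print(f"query: {len(found)} matches, {examined} items examined")
```

In this toy data set the scan examines all five items to find two matches, while the key-based query touches only the two items in its partition.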
To solve this, you can create a Global Secondary Index (GSI) over the artist name. A GSI defines its own partition key (and optional sort key) on attributes that are not part of the table's primary key, which lets you query on them and avoid scan operations. A GSI has its own provisioned throughput, configured separately from the table, so queries on it do not consume the table's RCUs/WCUs. A GSI can also be added after the table has been created.
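A sketch of adding such a GSI to an existing table and querying it could look like this. As before, the code only builds dictionaries in the shape boto3's `update_table` and `query` accept; the table name `Music`, the index name `ArtistAlbumIndex`, the capacity values, the choice of Album as the index sort key, and the function names are all assumptions.

```python
# Hypothetical names chosen for illustration.
TABLE_NAME = "Music"
GSI_NAME = "ArtistAlbumIndex"


def add_gsi_params():
    """update_table parameters that add a GSI to an existing table."""
    return {
        "TableName": TABLE_NAME,
        # Declare the attributes used as keys by the new index.
        "AttributeDefinitions": [
            {"AttributeName": "Artist", "AttributeType": "S"},
            {"AttributeName": "Album", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": GSI_NAME,
                    # The partition key differs from the table's (Genre).
                    "KeySchema": [
                        {"AttributeName": "Artist", "KeyType": "HASH"},
                        {"AttributeName": "Album", "KeyType": "RANGE"},
                    ],
                    "Projection": {"ProjectionType": "ALL"},
                    # Throughput is set on the index, separately from the table.
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": 5,
                        "WriteCapacityUnits": 5,
                    },
                }
            }
        ],
    }


def albums_by_artist_params(artist):
    """Query parameters that find an artist's albums across all genres."""
    return {
        "TableName": TABLE_NAME,
        "IndexName": GSI_NAME,
        "KeyConditionExpression": "#artist = :artist",
        "ExpressionAttributeNames": {"#artist": "Artist"},
        "ExpressionAttributeValues": {":artist": {"S": artist}},
    }
```

With boto3 installed and credentials configured, `boto3.client("dynamodb").update_table(**add_gsi_params())` would request the index, and `boto3.client("dynamodb").query(**albums_by_artist_params("Artist X"))` would then replace the scan.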
Each table in DynamoDB can have up to 20 global secondary indexes (default quota) and 5 local secondary indexes. In general, you should use global secondary indexes rather than local secondary indexes.
The differences between Local and Global Secondary Indexes are summarized below:
Local Secondary Indexes | Global Secondary Indexes |
---|---|
Use when you need to query on the same partition key but a different sort key | Use when you need to query on attributes other than the table's keys |
Every partition of a local secondary index is scoped to a base table partition that has the same partition key value | Queries on the index can span all of the data in the base table, across all partitions |
Can only be created when the base table is created | Can be created even after the table is created |
Shares the RCUs and WCUs of the base table, so heavy index traffic can throttle the table | Has its own RCUs and WCUs, configured when the index is created, so index queries don’t consume the base table's capacity |
The total size of indexed items for any one partition key value can’t exceed 10 GB | No size limitations |
Up to 5 local secondary indexes per table | Up to 20 global secondary indexes per table (default quota) |
You can learn more about secondary indexes in the official AWS DynamoDB documentation.