If your dataset is small and grows slowly, a single primary shard with one replica is enough. If your dataset is large or grows quickly, the optimal number of shards depends on the target number of nodes.
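As a rough illustration of the small-dataset case, the sketch below builds the body of an index-creation request with one primary shard and one replica. The function name `small_index_settings` is hypothetical; `number_of_shards` and `number_of_replicas` are the standard Elasticsearch index settings.

```python
import json


def small_index_settings(primary_shards: int = 1, replicas: int = 1) -> dict:
    """Build an index-creation body with explicit shard and replica counts.

    Hypothetical helper for illustration only; it talks to no cluster.
    """
    if primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    return {
        "settings": {
            "index": {
                # Fixed when the index is created.
                "number_of_shards": primary_shards,
                # Can be changed later on a live index.
                "number_of_replicas": replicas,
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would accompany a create-index request.
    print(json.dumps(small_index_settings(), indent=2))
```

The resulting JSON could be sent as the body of a create-index request. Keep in mind that with a single node the replica has nowhere to be allocated, so it only adds protection once a second data node exists.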
In fact, a single node can be sufficient for many simple use cases. However, given the distributed nature of the architecture, you can use more than one node to improve fault tolerance and prevent data loss. So the first question we need to answer is: how many nodes will we run?
To answer this question, we first need to answer a few others. For example: do we need non-data nodes? If we don't, then considering the Elasticsearch shard allocation policy, a node must hold at least one shard copy (a primary or a replica) to act as a data node. In that case, we can use the following formula:
Max number of data nodes ...