A Definitive Guide to Apache ShardingSphere

By: Trista Pan, Zhang Liang, Yacine Si Tayeb

Overview of this book

Apache ShardingSphere is a new open source ecosystem for distributed data infrastructures based on pluggability and cloud-native principles that helps enhance your database. This book begins with a quick overview of the main challenges faced by database management systems (DBMSs) in production environments, followed by a brief introduction to the software's kernel concept. After that, using real-world examples of distributed database solutions, elastic scaling, DistSQL, synthetic monitoring, database gateways, and SQL authority and user authentication, you’ll fully understand ShardingSphere's architectural components, how they’re configured and can be plugged into your existing infrastructure, and how to manage your data and applications. You’ll also explore ShardingSphere-JDBC and ShardingSphere-Proxy, the ecosystem’s clients, and how they can work either concurrently or independently to address your needs. You’ll then learn how to customize the plugin platform to define personalized user strategies and manage multiple configurations seamlessly. Finally, the book enables you to get up and running with functional and performance tests for all scenarios. By the end of this book, you’ll be able to build and deploy a customized version of ShardingSphere, addressing the key pain points encountered in your data management infrastructure.

Table of Contents (18 chapters)

1. Section 1: Introducing Apache ShardingSphere
4. Section 2: Apache ShardingSphere Architecture, Installation, and Configuration
10. Section 3: Apache ShardingSphere Real-World Examples, Performance, and Scenario Tests

Database and app online tracing

Distributed tracing systems are generally designed around the ideas in Google's Dapper paper. Several relatively mature implementations already exist, such as Zipkin by Twitter, SkyWalking by Apache, and CAT by Meituan-Dianping.

The following sections introduce how database and app online tracing works, and then show how Apache ShardingSphere implements this feature.

How it works

A distributed scheduling chain reconstructs one distributed request as a chain of calls across the services it touches. For each distributed request, details such as the time spent on each node, the machine that received the request, and the status of the request on each service node can be viewed in the backend.
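To make the idea concrete, the following is a minimal sketch of how such a chain could be represented and displayed. It is not ShardingSphere's implementation or any tracing library's API: the `Span` class, its fields, the helper functions, and the sample services and hosts are all hypothetical names chosen for illustration. Each record carries the per-node details mentioned above (time spent, receiving machine, and status), and a parent ID links the records into one chain.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    """One unit of work within a single distributed request (hypothetical model)."""
    trace_id: str             # shared by every span of the same request
    span_id: str              # unique within the trace
    parent_id: Optional[str]  # None marks the entry point of the request
    service: str              # logical service node that handled the call
    host: str                 # machine that received the call
    start_ms: int
    end_ms: int
    status: str = "OK"

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def group_by_parent(spans: List[Span]) -> Dict[Optional[str], List[Span]]:
    """Index spans by parent ID, ordered by start time, so the chain can be walked."""
    children: Dict[Optional[str], List[Span]] = {}
    for span in sorted(spans, key=lambda s: s.start_ms):
        children.setdefault(span.parent_id, []).append(span)
    return children


def render_chain(spans: List[Span]) -> List[str]:
    """Return one indented line per span: service, host, time spent, and status."""
    children = group_by_parent(spans)
    lines: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for span in children.get(parent_id, []):
            indent = "  " * depth
            lines.append(
                f"{indent}{span.service}@{span.host} {span.duration_ms}ms {span.status}"
            )
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


# Made-up sample data for one request; not taken from a real system.
SAMPLE_SPANS = [
    Span("t1", "1", None, "frontend", "host-a", 0, 120),
    Span("t1", "2", "1", "order-service", "host-b", 10, 110),
    Span("t1", "3", "2", "database", "host-c", 20, 90, "ERROR"),
    Span("t1", "4", "1", "auth-service", "host-d", 5, 9),
]

if __name__ == "__main__":
    for line in render_chain(SAMPLE_SPANS):
        print(line)
```

Running the sketch prints the request as an indented chain, with the database call nested under `order-service` and flagged as `ERROR`. A real tracing backend would collect these records from instrumented services instead of a hard-coded list.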

The following diagram of one distributed request scheduling chain is taken from Google's paper, Dapper, a Large-Scale Distributed Systems Tracing Infrastructure:

Figure 4.8 – A distributed request scheduling chain

Citation

Dapper...