What is Spring Data actually?

In my Spring Boot workshop last week, one of the participants asked me to clarify what Spring Data actually is. He came from a different background and was confused about how Spring Data, JPA, Hibernate and NoSQL relate and what role Spring Data plays. Indeed, if you switch contexts, it may look strange at first. It is not, though, and today I am writing down the same answer I gave in the workshop.

In the early days, relational databases were the norm and JDBC was the way to access them in Java. However, as JDBC only covers basic querying, you had to manually transform your data objects into SQL statements and the result rows back into objects. Of course, several object-relational mappers (ORMs) were created to solve this mapping in a reusable fashion.

However, the Java world loves standards, and thus a specification emerged: the Java Persistence API (JPA). It specifies a generic interface for ORMs, but it does not provide an implementation. It is a spec only. Hibernate and TopLink/EclipseLink are the well-known implementations.

As long as you worked with a relational database, this was all good: you could use the JPA way with Hibernate as the implementation. Your code uses JPA only, and you "only" define by configuration which JPA implementation to use and how.

Then came the rise of NoSQL databases, and JPA wasn't suitable for them. If your ORM supported more than SQL, you could use a mix of JPA and ORM specifics to access a NoSQL database, mainly MongoDB. If you used another one, you had to code the mapping yourself or use an existing library. Either way, it wasn't that reusable.

This is where Spring Data steps in and at least offers a solution when you work with the Spring Framework. In essence, it provides a common way to work with a database regardless of its kind while still giving you the possibility to work with specific traits of your data store of choice.

In your Spring application, you work with a Repository interface provided by Spring Data. Your application uses this interface and at this point does not know the underlying data store. You can define basic queries directly in the interface by following a method naming scheme. Or, of course, you can also use more store-specific queries here, like JPQL for JPA. All your application code interacts with these repositories. What actually runs behind them is out of sight and completely handled by store-specific Spring Data modules.
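To make the idea of "declare a method, get a query" more tangible, here is a minimal sketch in plain Java. It is not Spring Data's API and not how Spring Data is implemented. It is a toy analogy that only uses the JDK, and all names in it (DerivedQuerySketch, Customer, CustomerRepository, inMemoryRepository) are made up for illustration. The application code sees nothing but an interface, and a proxy behind it derives the filter from the method name, assuming the simple pattern findBy<Property> with exactly one argument.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class DerivedQuerySketch {

    // Hypothetical domain type, for illustration only.
    public record Customer(String firstName, String lastName) {}

    // The application only sees this interface; nobody writes an implementation by hand.
    public interface CustomerRepository {
        List<Customer> findByLastName(String lastName);
        List<Customer> findByFirstName(String firstName);
    }

    // Toy stand-in for a store-specific module: the "store" is just a list,
    // and the filter is derived from the method name "findBy<Property>".
    public static CustomerRepository inMemoryRepository(List<Customer> data) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (!name.startsWith("findBy") || args == null || args.length != 1) {
                throw new UnsupportedOperationException(name);
            }
            // "findByLastName" -> record accessor "lastName"
            String property = Character.toLowerCase(name.charAt(6)) + name.substring(7);
            Method accessor = Customer.class.getMethod(property);
            List<Customer> result = new ArrayList<>();
            for (Customer customer : data) {
                if (args[0].equals(accessor.invoke(customer))) {
                    result.add(customer);
                }
            }
            return result;
        };
        return (CustomerRepository) Proxy.newProxyInstance(
                CustomerRepository.class.getClassLoader(),
                new Class<?>[] {CustomerRepository.class},
                handler);
    }

    public static void main(String[] args) {
        CustomerRepository repository = inMemoryRepository(List.of(
                new Customer("Ada", "Lovelace"),
                new Customer("Alan", "Turing")));
        // The caller neither knows nor cares what sits behind the interface.
        System.out.println(repository.findByLastName("Turing"));
    }
}
```

In a real project the interface would extend one of Spring Data's repository interfaces, and the store-specific module would turn the method name into a query for the actual database. The sketch only shows why the calling code can stay the same while the part behind the interface changes.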

By using Spring Data, you get a store-independent, done-for-you way of data access and mapping. Yet it still allows you to use store-specific features.