02 May 2016

Business Logic in database or application

Many would agree that this has been an age old debate with programmers - where do we place the domain’s business logic? I have frequently been pulled into this discussion especially while dealing with legacy application in either enhancing or re-writing them. It has been less common to encounter such an debate with newer developments and teams as I find very few programmers venture close to a complex database programming and rely on frameworks which abstract most of the database from them. SQL programming is now seen more like a handicraft skill.

I would like to base my discussion in the context of enterprise applications, which have a very long life and the only thing constant would be change. I should stress that there are no best practises here, the decisions would vary based on the circumstances and your judgement. In this article I’ll look into the question of placing business logic in a database or within an application’s code. I’ll highlight key points which one should consider and explore further while making this decision. This is mostly an architectural decision and there are no black and white answers here, no perfect solution and one fits all approach.

What is Business Logic anyway?

To state in an overly simplified way - application logic generally applied to some data. These may be validations, logical decisions, transformations or processing data in to information. These may be as simple as a line of code or can span multiple steps which may require use of an orchestration. Sometimes they may also not involve any data storage. They are also referred as domain logic. Now you would want to have most of this in one place(ideally) and not scattered all over.

Database tier for your business logic

Database servers are very capable platforms and one can possibly run a very complex application entirely from a database. You can have your business logic written in stored procedures, views, functions and triggers in SQL. In fact during my early days I have written large multi-layered applications where a considerable amount of business logic was written within stored procedures. Multiple levels of processing would be achieved by using table variables or temp tables and pass the final result back to the calling code to be readily consumed. As you can see a lot of the business logic can be written in SQL.

Application tier for business logic

This would generally be in your domain layer or a designated layer (business layer) in your app which usually coordinates between the storage layer and the application layers. For the benefit of this article I would not deviate by considering how to build this layer. For most of the cases it would be safe to assume the preferred approach would be employing of an OO language. The database would serve only as a data store.

Performance

During my initial days in programming one of my reasons for using stored procedures for business logic was for performance and flexibility. The idea was keeping action close to data and you could modify the business many times without touching the application code. This would reduce a chatty interface and network load by sending smaller payloads. But as it turned out when applications and users grow, too much logic in a database turns out to be a processing bottleneck and maintenance nightmare. You lose out all the advantages you initially started out with when the application and requirements were small.
Application’s performance can be can still be high when business logic sits in your application tier while keeping it in a maintainable state. This can be truly appreciated in large applications. However one should still leverage what SQL is best at, at filtering data and set based operations. Database servers are generally singular in an enterprise and serve multiple users and applications. Scaling out a database is much harder than scaling out your application. Under heavy loads database object locks are the biggest contributors in degrading performance which can be alleviated by moving the processing outside of it. I have observed in databases where there are multiple triggers implementing business logic, especially on a single table, when moved outside vastly improved performance by increasing the number of transaction per second under heavy loads.
If cached query plans is the reason for sticking with stored procedures for business logic, then you can get the same results in SQL Server using sp_executesql from your data layer. Most of the ORM and micro-ORM internally promote query plan caching.
Minimise the trips to your database and reduce the number of data returned and the application tier should give a good performance and also make the maintenance easier. This would be more evident when your applications and database begin to grow. Most of the enterprise applications undergo a lot of churning, focusing on maintainability should be the primary goal.

Maintainability

One of the biggest pit falls of having database as your business layer is in its maintainability. There may be arguments out there that if done right and if there is a formalised approach, programming in T-SQL would be as maintainable. SQL is designed to only query/modify the data in a database. T-SQL adds extension to this and gives it procedural language capability. But reuse and abstraction of an implementation is hardly supported.
Let’s consider that we have a business case implemented entirely inside a stored procedure. You would definitely be having parameters being passed in and result out from this stored procedure. It is safe to assume that many clients(applications) utilise this stored procedure. Out of blue there is a change request on one of the applications. If this involves changing any of the parameters of the stored procedure, it would break all the applications which are referring it. In an enterprise this is no cakewalk. Extensibility is lost and keeping it DRY(Don’t Repeat Yourself) is very difficult. When programmers are faced with such enhancements, the path of least resistance is to make a copy of the stored procedure while applying the required changes and use it within the concerning app so that rest of the relying apps are not broken.
Cyclomatic complexity and LoC (Lines of Code) is a good measure to indicate how maintainable the code is. In OO languages there are numerous design patterns which provide a ubiquitous language which can guide any new team member in refactoring a given code. A stored procedure or database trigger can run in to 100+ LoC easily and it is always a troublesome experience to maintain or troubleshoot even for the original author of that code in T-SQL. Refactoring is not intuitive when dealing with T-SQL.
On how many occasions does an organisation require a new team member to work on an existing application? In my opinion the barrier of entry for a new team member is comparatively less when the business logic is in its own layer in the application tier.

I am not familiar with any testing framework which we can use to unit test database logic implementations so I will refrain from commenting on testability of SQL code. But we have a mature market out there to test your application code.

Reusability

In any environment reusability improves code maintainability and brings down the overall cost of building an application. When we are faced with a new requirement which might be a slight modification of an existing code, it is all too tempting to copy and paste the code with slight modifications to meet the new requirement.
When you follow OOP, one of its principles is reusability. In fact even in procedural languages, reusability was desirable and could be achieved. With OOP you can target reuse at a very granular level, which is a class or at a coarse level even with growing complexity. Achieving this level of reusability is very difficult in SQL. I am not denying that it is not possible, but not easy. Even at a coarse level it is not that intuitive. Views are a good place to start, but they lack the support of parameters. Moving over to stored procedures, when attempts are made to provide functionality through a stored procedure by generalising it, you would generally see an explosion in the list of parameters in order to accommodate all the applications which intend to use this stored procedure since each may need a slightly different result. At a granular level one would get more mileage by using an inline UDF (User Defined Function) instead of a stored procedure. But be warned, it is not advisable to use scalar UDF due to their poor performance, especially when you include them in multi-row operations.
You can also implement business logic which is global and thereby automatically reused through constraints, triggers or unique filtered indexes. Consider constraints for enforcing data integrity if required and triggers only for requirements which are meant for simple auditing. The problem with triggers with implementing business rules is that they are the ninja in an application. They work in stealth. I have spent countless hours trying to figure out reason behind a certain mysterious behaviour only to find that a trigger was responsible for that.
I would leave you to explore deeper in each of the reusability options and take an informed decision.

Integrity (referential integrity)

Referential integrity (RI) is part of business logic. Traditionally this has always been a part of database and more so because they support it very well. But as complexity of applications increased and more n-tier applications became common, you may realise that most of the integrity rules or validations would need to be repeated in many layers.
Let’s consider an example where we have a rich HTML app using REST services which in turn depends on a shared database. These REST services may also be used by other applications. The HTML app uses javascript which encapsulates business logic along with validations and pushes a JSON to the REST service. This JSON would be transformed in to another business object or domain object and all of the same logic or may be more would need to be applied again at this layer. This when persisted to the database would need to meet the RI checks placed here. Even though we strive for DRY, it is not always possible. We could have the validation restricted to one layer only, but that would only delay the result when data is incorrect or worse persist corrupt data. Each option presents with benefits and trade-offs and there are no set rules here. It depends on the needs and a developer might spread this responsibility across database and the domain object.
Applications which communicate to multiple databases and where a single database is not the only data store, RI checks have to be done in the business objects or your application layer. Once again ideally you would want to have your business logic in a tier which is commonly shared. Databases are often the best place for implementing RI. However placing rest of the business logic in here too might prove costly in maintenance and be a performance inhibitor when the complexity and size increases.

Security

Per se application security does not come under business logic, but it controls which part of business logic user has access to. It is an integral part of any application and business. When you have many applications in an organisation which are serviced by a single central database, maintaining security (authorisation) at the database seems very tempting. After all we can centrally control what a user can access/execute irrespective of apps entering or leaving the ecosystem.
On database objects we can define security at a very granular level. To ease manageability, we can define DB roles and then assign them to DB objects. We can have very capable authorisation through databases. But they have drawbacks and bottlenecks which alone are worth considering to manage authorisation outside of database servers.
In order to utilise database authorisation we need to opt for impersonation using windows authentication. SQL authentication would not be the preferred choice in many organisations, but it poses the same problems while being less secure.

But with impersonation each individual windows account needs to be mapped to the database server. With this one loses scalability due to no connection pooling, so limiting number of users simultaneously accessing the database. It is the client/application which handles connection pooling based on the connection string. When individual window users connect to the DB server, a pool is created per Windows Identity. When you use a service account or a process account to connect to the database, through effective use of connection pooling you can significantly enhance performance and scalability. You would of course have to pass user context with each call to the database to identify the user.

In today’s growing need of interconnected applications and sharing of data, it is often necessary and prudent to know what a user is authorised to do early on. You could store authorisation/membership information of users in a database, which can be queried to build a security context of an user.
In today’s growing need of interconnected applications and sharing of data, it is often necessary and prudent to know what a user is authorised to do early on. You could store authorisation/membership information of users in a database, which can be queried to build a security context of an user.

Conclusion

My personal preference is to contain business logic in the application tier. However I cannot stress enough on how unique each of your business and technical environments would be for one solution to fit all. From my experience in large enterprise applications, most of the times the data is scattered across different sources or across multiple RDBMS servers. Even if they happen to be concentrated in a single database, keeping logic in the database limits your choice on many fronts and depends also on the teams capabilities. Meanwhile it would be foolish to completely ignore the capabilities of SQL and limit to simple CRUD statements. Let your application tier drive your business logic but also utilise SQL capabilities to maximise performance.

Ajeet Yelandur's Blog

Discovering code