Thursday 11 February 2016

Using DbDeploy in large globally distributed project

Recently we successfully implemented DBDeploy for managing database migrations in a large globally distributed project. It was a huge program with more than 200 developers working in separate teams, globally distributed in seven locations.
The database management was done manually. For every release developers used to prepare release notes, listing all the sql scripts which should be executed for that release.
It was chaotic to manage correct list of sqls to be executed. Worst yet, given an environment and db schema, no one ever knew which sql scripts are run on that schema to take it to current state. This process was definitely not scalable and was one of the biggest bottleneck in making successful releases.
To fix this situation, we decided to introduce DBDeploy for database change management. DBDeploy is a well known database change management tool, which is built around various ideas discussed in the book ‘Refactoring Databases’. Several tools like flyway and liquibase are available now, but we chose DBDeploy because it was easier to introduce (doesn’t need you to write XML like liquibase) and also had a feature to just output consolidated sql file from migrations, instead of executing all the migrations directly against database. This was crucial because DBAs were very keen to review all the changes going into database for every release.
DBDeploy kind of tools is a common practice now, and many people know about standard patterns and practices to use these tools, But when you have several teams working on a same codebase and distributed globally, there are some unique set of challenges you face.  Based on our learnings in last few months, following are some of the things which are crucial to successfully implement DBDeploy in large distributed teams, making changes to same database.


  1. Use Timestamp as a change number
DbDeploy expects each migration to be named with unique change number. Generally a simple integer is used. But when you have globally distributed teams, it’s impossible to use simple integer. When you have several developers distributed across locations, its very hard to know what next integer to use for your script. If the last script in version control system is say 001_create_customer_table.sql, and you add your script as 002_create_address_table.sql, there is high chance that someone has already checked in a script with id 002. Using Timestamp in the format yyyyMMddHHmmss_<scriptname>.sql solves that problem. There is less likelihood for the timestamp with millisecond granularity to collide.


     2.   Write a build task to generate migration scripts with correct timestamp.
Even if you expect everyone to know and follow rules to create correctly named migrations, people make mistakes. Unless there is a quick and easy way to generate timestamps for migrations, most developers will copy paste one of the existing scripts and change one digit in the timestamp value. So make sure you write a gradle or ant task to generate database migration files with correct timestamp format.


The gradle task we wrote is as following
def askTicket() {
def console = System.console()
def ticket
for (;;){
if (console) {
ticket = console.readLine('\n> Please enter your story or jira reference (e.g: DT-002): ')
} else {
logger.error "Cannot get console."
}
if (ticket ==~ /([A-Za-z])+\-[0-9]+/){
break
}
}
return ticket
}


def timestampExists(timestamp, location) {
return false
}


task newMigration {
    doLast {
def sql_path = "../../db/migrations"
def ticket = askTicket()


def timestamp = new Date().format('yyyyMMddHHmmss', TimeZone.getTimeZone('GMT'))


def newMigrationFile = "${sql_path}/${timestamp}-${ticket}.sql"
println ">> ${newMigrationFile}"


File configFile = file(new File("${newMigrationFile}"))
configFile.write("-- migration for '${ticket}' at '${timestamp}'")
    }
}


     3.  Have a CI build just for running DBDeploy migrations.
It is very useful to have CI build just for running db migrations against a separate CI database. Having a separate build helps because, typically db migrations run very quickly and do not depend on any other build pipeline.
On large monolithic projects, you typically tend to have long running build, which make it harder to get quick feedback (within seconds) of committing wrong migration.


     4.  Add Validations for migrations.
Often, it's not enough to have conventions and expect people to follow it. It helps when we have build that breaks when conventions are not followed. Simplest one to violate is the naming of scripts. There is no guarantee that every developer will use the build script you have to generate correctly named scripts. There will be scripts which have wrong timestamp value.
With timestamps, it's important to have in the form yyyyMMddHhmmss. One more or less digit and the order of execution will change.
There are other useful validations like DDL and DMLs should be in a separate file. There should generally be no ‘drop table’ statement.
We modified DBDeploy to allow adding these kind of validations before executing migrations. It was extremely helpful, to have these validations break the CI build.


Intercepting Filter in DBDeploy



In DBDeploy we added a generic intercepting filter which executes before applying all the new migrations.  This allowed us to plugin required functionality in DBDeploy. Executing validations or generating report from all the new migrations was easy to add then. Here is how you can specify intercepting filter in DBDeploy now.


def dbDeploy = new com.dbdeploy.DbDeploy(driver: "${driver}",
url: "${jdbcUrl}",
userid: "${jdbcUsername}",
password: "${jdbcPassword}",
scriptdirectory: new File("${migrations}"),
dbms: "ora",
changeScriptFilter: changeScriptValidatingFilfer,
undoOutputfile: new File("${migrationsPath}/undo_last_change.sql"))


dbDeploy.go()


ChangeScriptFilter which has a generic interface
public interface ChangeScriptFilter {
 void process(List<ChangeScript> changeScripts);
}


You can then implement ChangeScriptFilter as part of your build. Gradle makes it really easy to write groovy script to implement this filter and add this to part of the build. Adding Validations is just one usage of this filter.

public class ChangeScriptValidatingFilfer  implements ChangeScriptFilter {
public void process(List<ChangeScript> scripts) {
for(ChangeScript script:scripts) {
applyValidations(script);
}
}


Validations can be written something like this


public class DMLAndDDLSeparationValidator implements ChangeScriptValidator {
public boolean validate(com.dbdeploy.scripts.ChangeScript changeScript) {
String content = changeScript.getContent().toLowerCase();
if (hasDDL(content) && hasDML(content)) {
throw new com.dbdeploy.exceptions.ChangeScriptValidationFailedException(changeScript + " has create or alter table and insert/update. Please put DML and DDL in separate scripts");
}
return true;
}


private boolean hasDML(String content) {
return content.contains("insert into ") || content.contains("update ");
}


private boolean hasDDL(String content) {
return content.contains("create table ") || content.contains("alter table ");
}
}

    5. Have a dryRun task which DBAs can run before executing dbdeploy against production database.
It helped having a dryRun task which shows all the changes which will be applied when migrations are run. DBDeploy already has a feature which allows you have to specify ‘outputFile’ and then it consolidates all the migrations to be applied to given schema in that file.
It helped us to write some scripts to analyse newly applied migrations and generate a report to show what all tables are affected and if there are inserts or alters or updates in those tables.
Some of the tables are sensitive and DBAs like to if the current deployment might affect those tables in any way.
DryRun task was also added with a custom intercepting filter to analyse all the newly executing scripts. Here is a snapshot of script analyser.

class ChangeScriptAnalyzer implements ChangeScriptFilter {
def inserts = [:]
def updates = [:]
def creates = [:]
def alters = [:]


void process(List<ChangeScript> scripts) {
scripts.each { script ->
def tableAnalyzer = new util.TableAnalyzer(sqls: script.content.readLines())


putInMap(inserts, tableAnalyzer.getTablesWithInserts(), script)
putInMap(updates, tableAnalyzer.getUpdatedTables(), script)
putInMap(creates, tableAnalyzer.getNewlyCreatedTables(), script)
putInMap(alters, tableAnalyzer.getAlteredTables(), script)
}
}


Which then could be used to print reports as following.


Data is inserted in following tables (38)
JOB_MASTER
20151130061549-JIRA-122072.sql
20151216233622-JIRA-120792.sql
20160112210928-JIRA-120792.sql

JOB_CATEGORY
20151130061549-JIRA-122072.sql
20151216233622-JIRA-120792.sql

CONFIGURATIONS_MASTER
20151209080100-JIRA-126407.sql
20151209165500-JIRA-125508.sql
20151213045000-JIRA-122529.sql

Data is updated in following tables (21)
CONFIGURATION_MASTER
20151209165500-JIRA-125508.sql

WEBSERVICE_CFG
20151209165500-JIRA-125508.sql
20160105202700-JIRA-60012.sql
Following tables are newly created (13)

CUSTOMER_ADDRESS
20151210070136-JIRA-127509.sql


Following tables are altered (6)


CUSTOMER
20151209091755-JIRA-124737.sql
20151210070136-JIRA-127509.sql


CONTENT_MASTER
20151209180600-JIRA-126977.sql
20151215183001-JIRA-127777.sql


This helped the DBAs to identify if there is any sensitive table which is modified by current deployment and to have a closer look at migration which is modifying that table.


6. Stored procedures are not Migrations.
Do not put stored procedures in migrations. Stored procedures are compiled in database and all the procedures can be applied for every deployment. Adding a migration for every change in stored procedure, makes it extremely hard to track version history of that procedure.
So manage stored procedures as database source outside of migrations. Apply all procedures to database for every deployment. Have a the deployment script fail if there are compilation errors in stored procedures.


7.  Get consolidated list of all exceptions in the end.
This one is controversial, but it greatly helped us in initial days.
DBDeploy fails on the first sql exception it gets while executing migrations.  When you have globally distributed teams, managing their own databases, and especially with developers not used to automation and dbdeploy, they tend to manually execute migration scripts to fix issues.
This creates problem for deployments then, because dbdeploy execution fails with errors like Unique Constraint or Table already exists etc..
Imagine a deployment which is executing 100 migrations, and 10 of those migrations fail. If dbDeploy fails for each migration, its extremely painful to fix issue, and rerun dbdeploy.
Instead, if you get all the errors (of type table already exists etc…) at the end of run, it's lot easier to go through all the errors at once and then either make manual entries in changelog table or reach out to developers who wrote those migrations to understand why they are failing.
This is not something that you will need in a typical project and might not be the right thing to do as well. But this helped us a lot.  For getting all the errors in the end for specific error codes, we could specify it as following

def dbDeploy = new com.dbdeploy.DbDeploy(driver: "${driver}",
url: "${jdbcUrl}",
userid: "${jdbcUsername}",
password: "${jdbcPassword}",
scriptdirectory: new File("${migrationPath}"),
dbms: "ora",
changeScriptFilter: validatorProvider,
exceptionsToContinueExecutionOn: "ORA-00001,ORA-00955,ORA-01430,ORA-00957,ORA-01430",
undoOutputfile: new File("${migrationsPath}/undo_last_change.sql"))


dbDeploy.go()


This helped a lot in initial phase of roll out, when people were still running manual scripts and DBDeploy will give all the unique constraint or table already exists, kind errors at once in the end, instead of breaking for each error.

The code with all the changes mentioned above is available at https://github.com/unmeshjoshi/dbdeploy

No comments:

Post a Comment