RosettaDB Commands
Rosetta Commands
Available commands
- init
- extract
- compile
- dbt
- diff
- test
- apply
- generate
init
This command will generate a project (directory) if specified, a default configuration file located in the current directory with example connections for bigquery
and snowflake
, and the model directory.
rosetta init [PROJECT_NAME]
Parameter | Description |
---|---|
(Optional) PROJECT_NAME | Project name (directory) where the configuration file and model directory will be created. |
Example:
#example with 2 connections
connections:
- name: snowflake_weather_prod
databaseName: SNOWFLAKE_SAMPLE_DATA
schemaName: WEATHER
dbType: snowflake
url: jdbc:snowflake://<account_identifier>.snowflakecomputing.com/?<connection_params>
userName: bob
password: bobPassword
- name: bigquery_prod
databaseName: bigquery-public-data
schemaName: breathe
dbType: bigquery
url: jdbc:bigquery://[Host]:[Port];ProjectId=[Project];OAuthType= [AuthValue];[Property1]=[Value1];[Property2]=[Value2];...
userName: user
password: password
tables:
- bigquery_table
extract
This is the command that extracts the schema from a database and generates declarative DBML models that can be used for conversion to alternate database targets.
rosetta [-c, --config CONFIG_FILE] extract [-h, --help] [-s, --source CONNECTION_NAME] [-t, --convert-to CONNECTION_NAME]
Parameter | Description |
---|---|
-h, --help | Show the help message and exit. |
-c, --config CONFIG_FILE | YAML config file. If none is supplied it will use main.conf in the current directory if it exists. |
-s, --source CONNECTION_NAME | The source connection name to extract schema from. |
-t, --convert-to CONNECTION_NAME (Optional) | The target connection name in which source DBML converts to. |
Example:
---
safeMode: false
databaseType: bigquery
operationLevel: database
tables:
- name: "profiles"
type: "TABLE"
schema: "breathe"
columns:
- name: "id"
typeName: "INT64"
jdbcDataType: "4"
ordinalPosition: 0
primaryKeySequenceId: 1
columnDisplaySize: 10
scale: 0
precision: 10
primaryKey: false
nullable: false
autoincrement: true
- name: "name"
typeName: "STRING"
jdbcDataType: "12"
ordinalPosition: 0
primaryKeySequenceId: 0
columnDisplaySize: 255
scale: 0
precision: 255
primaryKey: false
nullable: false
autoincrement: false
compile
This command generates a DDL for a target database based on the source DBML which was generated by the previous command (extract
).
rosetta [-c, --config CONFIG_FILE] compile [-h, --help] [-t, --target CONNECTION_NAME] [-s, --source CONNECTION_NAME]
Parameter | Description |
---|---|
-h, --help | Show the help message and exit. |
-c, --config CONFIG_FILE | YAML config file. If none is supplied it will use main.conf in the current directory if it exists. |
-s, --source CONNECTION_NAME (Optional) | The source connection name where models are generated. |
-t, --target CONNECTION_NAME | The target connection name in which source DBML converts to. |
-d, --with-drop | Add query to drop tables when generating ddl. |
Example:
CREATE SCHEMA breathe;
CREATE TABLE breathe.profiles(id INTEGER not null AUTO_INCREMENT, name STRING not null);
dbt
This is the command that generates dbt models for a source DBML which was generated by the previous command (extract
).
rosetta [-c, --config CONFIG_FILE] dbt [-h, --help] [-s, --source CONNECTION_NAME]
Parameter | Description |
---|---|
-h, --help | Show the help message and exit. |
-c, --config CONFIG_FILE | YAML config file. If none is supplied it will use main.conf in the current directory if it exists. |
-s, --source CONNECTION_NAME | The source connection name where models are generated. |
diff
Show the difference between the local model and the database. Check if any table is removed, or added or if any columns have changed.
rosetta [-c, --config CONFIG_FILE] diff [-h, --help] [-s, --source CONNECTION_NAME]
Parameter | Description |
---|---|
-h, --help | Show the help message and exit. |
-c, --config CONFIG_FILE | YAML config file. If none is supplied it will use main.conf in the current directory if it exists. |
-s, --source CONNECTION_NAME | The source connection is used to specify which models and connection to use. |
-m, --model MODEL_FILE (Optional) | The model file to use for apply. Default is model.yaml |
Example:
There are changes between local model and targeted source
Table Changed: Table 'actor' columns changed
Column Changed: Column 'actor_id' in table 'actor' changed 'Precision'. New value: '1', old value: '5'
Column Changed: Column 'actor_id' in table 'actor' changed 'Autoincrement'. New value: 'true', old value: 'false'
Column Changed: Column 'actor_id' in table 'actor' changed 'Primary key'. New value: 'false', old value: 'true'
Column Changed: Column 'actor_id' in table 'actor' changed 'Nullable'. New value: 'true', old value: 'false'
Table Added: Table 'address'
test
This command runs tests for columns using assertions. Then they are translated into query commands, executed, and compared with an expected value. Currently supported assertions are: equals(=), not equals(!=), less than(<), more than(>), less than or equals(<=), more than or equals(>=), contains(in), is null, is not null, like, between
. Examples are shown below:
rosetta [-c, --config CONFIG_FILE] test [-h, --help] [-s, --source CONNECTION_NAME]
Parameter | Description |
---|---|
-h, --help | Show the help message and exit. |
-c, --config CONFIG_FILE | YAML config file. If none is supplied it will use main.conf in the current directory if it exists. |
-s, --source CONNECTION_NAME | The source connection is used to specify which models and connections to use. |
Note: Value for BigQuery Array columns should be comma separated value ('a,b,c,d,e').
Example:
---
safeMode: false
databaseType: "mysql"
operationLevel: database
tables:
- name: "actor"
type: "TABLE"
columns:
- name: "actor_id"
typeName: "SMALLINT UNSIGNED"
ordinalPosition: 0
primaryKeySequenceId: 1
columnDisplaySize: 5
scale: 0
precision: 5
nullable: false
primaryKey: true
autoincrement: false
tests:
assertion:
- operator: '='
value: 16
expected: 1
- name: "first_name"
typeName: "VARCHAR"
ordinalPosition: 0
primaryKeySequenceId: 0
columnDisplaySize: 45
scale: 0
precision: 45
nullable: false
primaryKey: false
autoincrement: false
tests:
assertion:
- operator: '!='
value: 'Michael'
expected: 1
Output example:
Running tests for mysql. Found: 2
1 of 2, RUNNING test ('=') on column: 'actor_id'
1 of 2, FINISHED test on column: 'actor_id' (expected: '1' - actual: '1') ......................... [PASS in 0.288s]
2 of 2, RUNNING test ('!=') on column: 'first_name'
2 of 2, FINISHED test on column: 'first_name' (expected: '1' - actual: '219') ..................... [FAIL in 0.091s]
apply
Gets current model and compares with state of database, generates ddl for changes and applies to database.
rosetta [-c, --config CONFIG_FILE] apply [-h, --help] [-s, --source CONNECTION_NAME]
Parameter | Description |
---|---|
-h, --help | Show the help message and exit. |
-c, --config CONFIG_FILE | YAML config file. If none is supplied it will use main.conf in the current directory if it exists. |
-s, --source CONNECTION_NAME | The source connection is used to specify which models and connection to use. |
-m, --model MODEL_FILE (Optional) | The model file to use for apply. Default is model.yaml |
Example:
(Actual database)
---
safeMode: false
databaseType: "mysql"
operationLevel: database
tables:
- name: "actor"
type: "TABLE"
columns:
- name: "actor_id"
typeName: "SMALLINT UNSIGNED"
ordinalPosition: 0
primaryKeySequenceId: 1
columnDisplaySize: 5
scale: 0
precision: 5
nullable: false
primaryKey: true
autoincrement: false
tests:
assertion:
- operator: '='
value: 16
expected: 1
(Expected database)
---
safeMode: false
databaseType: "mysql"
operationLevel: database
tables:
- name: "actor"
type: "TABLE"
columns:
- name: "actor_id"
typeName: "SMALLINT UNSIGNED"
ordinalPosition: 0
primaryKeySequenceId: 1
columnDisplaySize: 5
scale: 0
precision: 5
nullable: false
primaryKey: true
autoincrement: false
tests:
assertion:
- operator: '='
value: 16
expected: 1
- name: "first_name"
typeName: "VARCHAR"
ordinalPosition: 0
primaryKeySequenceId: 0
columnDisplaySize: 45
scale: 0
precision: 45
nullable: false
primaryKey: false
autoincrement: false
tests:
assertion:
- operator: '!='
value: 'Michael'
expected: 1
Description: Our actual database does not contain first_name
so we expect it to alter the table and add the column, inside the source directory there will be the executed DDL and a snapshot of the current database.
generate
This command will generate Spark Python (file) or Spark Scala (file), firstly it extracts a schema from a source database and gets connection properties from the source connection, then it creates a python (file) or scala (file) that translates schemas, which is ready to transfer data from source to target.
rosetta [-c, --config CONFIG_FILE] generate [-h, --help] [-s, --source CONNECTION_NAME] [-t, --target CONNECTION_NAME] [--pyspark] [--scala]
Parameter | Description |
---|---|
-h, --help | Show the help message and exit. |
-c, --config CONFIG_FILE | YAML config file. If none is supplied it will use main.conf in the current directory if it exists. |
-s, --source CONNECTION_NAME | The source connection name to extract schema from. |
-t, --target CONNECTION_NAME | The target connection name where the data will be transfered. |
--pyspark | Generates the Spark SQL file. |
--scala | Generates the Scala SQL file. |