Mambu Extract
  • 28 Aug 2024


Article summary

Mambu Extract is a data replication solution that efficiently syncs near real-time data from Mambu to the customer's data store. It uses AWS Database Migration Service (DMS) for replication and offers a serverless option that reduces operational costs. The architecture captures changes in the database, transforms them into Parquet files, and uploads them to the customer's S3 bucket, giving customers full control over their data for downstream integrations. This article covers file and S3 bucket configurations, data upload operations, load-based scalability, monitoring via Grafana Cloud, and high availability with disaster recovery measures. Mambu recommends extracting only specific tables to minimize data transfer size and exposure.

Mambu Extract is a near real-time data replication solution between Mambu and the customer that is designed with high throughput and reliability. Mambu Extract outperforms other methods of data extraction, such as database backups via API, the Stitch ETL connector, and streaming data, by incrementally syncing only the latest data to the customer data store. These syncs happen on a regular cadence to ensure that the target data store is as close as possible to the live data on the Mambu banking platform.

NOTE

Mambu Extract is currently only available for customers with dedicated environments on AWS.

Architecture

Mambu Extract’s architecture is based on AWS Database Migration Service (DMS), which has evolved to support ongoing replication, or change data capture (CDC), and database schema conversion. It offers a serverless option, reducing operational overhead and costs while natively providing high availability.

Mambu Extract uses the customer's production RDS MySQL primary (master) database as its source and replicates data to an AWS S3 bucket on the customer side as the target. This setup gives the customer full control over the data and further downstream integrations - for example, a data warehouse or data lake - without any dependency on Mambu.

The architectural flow is described in the following diagram:

Mambu Extract architecture

  1. The customer sends a request to the Mambu Core Banking Engine APIs.
  2. The Core Banking Engine commits the change to the database (RDS MySQL).
  3. AWS Database Migration Service captures the change in the database, transforms it into a Parquet file, and uploads it to the customer's S3 bucket.
  4. The customer downloads the Parquet file.
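Step 4, on the consumer side, can be sketched as a listing of replicated objects for one tenant and table. This is a minimal illustration, not Mambu code: it assumes a boto3-style S3 client with a `list_objects_v2` paginator and the `mambu-extract-data/<tenant>/<table>/` prefix layout described in the next section; the bucket, tenant, and table names are placeholders.

```python
from typing import List

def list_table_parquet_keys(s3_client, bucket: str, tenant: str, table: str) -> List[str]:
    """Return the sorted keys of all replicated Parquet files for one tenant/table.

    Assumes the "mambu-extract-data/<tenant>/<table>/" prefix layout described
    in this article; `s3_client` is a boto3-style S3 client.
    """
    prefix = f"mambu-extract-data/{tenant}/{table}/"
    keys: List[str] = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".parquet"):  # skip any non-data objects
                keys.append(obj["Key"])
    return sorted(keys)
```

From here the consumer can download each key with a regular S3 `get_object` call and feed the Parquet files into its data warehouse or data lake.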

File and S3 bucket configurations

S3 bucket folder structure

S3 object storage is used as the target for the data extraction and must be configured in the customer’s AWS account. The folder structure in the bucket is created automatically by Mambu Extract and the AWS DMS service, as follows:

<bucket name>
    mambu-extract-data
        <tenant-name-one>
            <table-name-one>
            …
            <table-name-n>
        …
        <tenant-name-n>

S3 bucket structure example - initial load

  • Table name: “sequence table”
  • File: LOAD00000001.parquet → This file is created during initial data load.
  • Folder: 2024/ → This folder is created during CDC (Change Data Capture).

Initial load S3 bucket structure

S3 bucket structure example - Change Data Capture

  • Table name: “sequence table”
  • Folder: 2024 (automatically created by DMS, represents “YEAR” when file was extracted)
  • Folder: 08 (automatically created by DMS, represents “MONTH” when file was extracted)
  • Folder: 01 (automatically created by DMS, represents “DAY” when file was extracted)
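As an illustration of how a consumer might work with this layout, the helper below splits a CDC object key into its tenant, table, and date components. It assumes the `mambu-extract-data/<tenant>/<table>/<YYYY>/<MM>/<DD>/<file>.parquet` layout shown above; the file name in the usage example is a placeholder.

```python
from typing import Dict

def parse_cdc_key(key: str) -> Dict[str, str]:
    """Split an object key of the form
    "mambu-extract-data/<tenant>/<table>/<YYYY>/<MM>/<DD>/<file>.parquet"
    into its components."""
    parts = key.split("/")
    if len(parts) != 7 or parts[0] != "mambu-extract-data":
        raise ValueError(f"unexpected key layout: {key}")
    _root, tenant, table, year, month, day, filename = parts
    return {
        "tenant": tenant,
        "table": table,
        "date": f"{year}-{month}-{day}",
        "file": filename,
    }

info = parse_cdc_key("mambu-extract-data/tenant-one/loanaccount/2024/08/01/change-batch.parquet")
# info["table"] is "loanaccount", info["date"] is "2024-08-01"
```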

S3 bucket structure CDC

Parquet files

Parquet files are generated with the following AWS DMS S3 target endpoint settings:

{
    "CsvRowDelimiter": "\n",
    "CsvDelimiter": ",",
    "CompressionType": "NONE",
    "DataFormat": "parquet",
    "ParquetVersion": "parquet-2-0",
    "TimestampColumnName": "timeMigration"
}

Operations

Data upload

Upload frequency is configurable per environment. By default, during the full load phase (the initial full database replication), each Parquet file is roughly 650 MB.

During the CDC phase, the minimum Parquet file size is set to 32 MB and the batching interval to 1 minute. While CDC is ongoing, transactions are batched together (cached), and writing a Parquet file is triggered by whichever condition is met first: the 32 MB file size or the 1 minute interval.

Transaction Parquet files are placed under the specific table folder, in the subfolder for the date on which the row changes happened.
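The flush trigger described above can be sketched as a simple predicate: the cached batch is written as soon as either threshold is crossed. The thresholds mirror the defaults quoted above; the function itself is an illustration, not Mambu code.

```python
MAX_BATCH_BYTES = 32 * 1024 * 1024  # 32 MB minimum file size threshold
MAX_BATCH_SECONDS = 60              # 1 minute interval threshold

def should_flush(batch_bytes: int, seconds_since_last_flush: float) -> bool:
    """Return True when the cached CDC batch should be written out as a
    Parquet file: whichever condition is met first triggers the write."""
    return (batch_bytes >= MAX_BATCH_BYTES
            or seconds_since_last_flush >= MAX_BATCH_SECONDS)
```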

Scalability

Mambu Extract uses AWS DMS Serverless, which scales automatically based on the load (data changes in the source data store, RDS MySQL). To avoid unpredictable or unbounded scale-outs, the range of DCUs (DMS Compute Units) is pre-configured to a set value, for example 4-16 DCUs. This configuration is based on the typical load of the customer environment. The DMS load is continuously monitored and the scalability range is adjusted to ensure low data replication latency.

Replication lag can be reduced by increasing the scalability range. To change the scalability range, the DMS service must be disabled, the new configuration applied, and the DMS service re-enabled, so a downtime window for Mambu Extract must be scheduled for this action.
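The pre-configured DCU range corresponds to the compute configuration of a DMS Serverless replication. A sketch of the relevant fragment, using the example range from the text (field names follow the DMS `ComputeConfig` structure; the exact values are environment-specific):

```json
{
  "ComputeConfig": {
    "MinCapacityUnits": 4,
    "MaxCapacityUnits": 16,
    "MultiAZ": true
  }
}
```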

Monitoring and alerting

Mambu Extract is monitored using the Mambu observability platform via Grafana Cloud. Critical metrics are exported to Grafana and alerts are pre-configured to notify your product and SRE teams.

To ensure replication lag stays minimal (up to 5 minutes), monitoring is set on the following metrics:

  • CDC source latency
  • CDC target latency

NOTE

Under regular conditions, replication lag is measured in seconds; higher lag is expected only for short periods, mostly during spikes when large data sets are modified.
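A sketch of reading these latency metrics, assuming a boto3-style CloudWatch client and the AWS/DMS namespace. The dimension name and task identifier below are placeholders and may differ for serverless replications; this is an illustration, not Mambu's monitoring code.

```python
import datetime

def latest_cdc_latency(cloudwatch, task_id: str, metric: str) -> float:
    """Fetch the most recent datapoint (in seconds) of CDCLatencySource or
    CDCLatencyTarget for a DMS replication over the last 15 minutes."""
    now = datetime.datetime.utcnow()
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/DMS",
        MetricName=metric,
        Dimensions=[{"Name": "ReplicationTaskIdentifier", "Value": task_id}],
        StartTime=now - datetime.timedelta(minutes=15),
        EndTime=now,
        Period=60,
        Statistics=["Maximum"],
    )
    points = sorted(resp["Datapoints"], key=lambda p: p["Timestamp"])
    return points[-1]["Maximum"] if points else 0.0
```

An alert would then fire when either metric exceeds the 5-minute (300-second) lag budget mentioned above.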

High availability and disaster recovery

To ensure that the Mambu Extract service remains highly available, the AWS DMS service is configured with a Multi-AZ setup. In case of a single Availability Zone failure, the DMS service automatically fails over to another available Availability Zone.

AWS availability configuration

Customer requirements

Customers must configure an S3 bucket for the production (primary) region as well as for the disaster recovery region. S3 is a regional AWS service, so a bucket must be created in each region.

Mambu table list

Mambu Extract allows you to extract all tables. However, to minimize the size of transferred data and the exposure of raw data, Mambu recommends that customers use the following standard list of tables:

Table names:

  • loanaccount
  • loantransaction
  • gljournalentry
  • customfield
  • customfieldvalue
  • client
  • customfieldset
  • savingsaccount
  • savingstransaction
  • loanproduct
  • repayment
  • transactiondetails
  • transactionchannel
  • activity
  • lineofcredit

For more details about data structures, please refer to the following page: Data Dictionary.

