Seamlessly Migrate Digital User Identities: A Step-by-Step Guide

Hishaam Abdulaziz

Lead Engineer - Identity

November 4, 2024

Ensuring Secure and Efficient Identity Migration

Intro

The prospect of an Identity Migration may appear intimidating, but with the right plan, it doesn’t have to be. Imagine you’re spearheading a digital transformation at your company, and among various strategic initiatives, a key decision emerges: migrating user identities to a new database or directory. This might be prompted by deploying a new Identity and Access Management (IAM) system, switching to a different vendor, or transitioning from an on-premise setup to a private or public cloud. While this transformation involves orchestrating people, processes, and technology, let’s focus on a critical component: Ensuring a smooth migration of user identities to the new platform.

This blog won’t focus on the ‘why’ behind the migration, but rather the ‘how.’ Having gone through this process several times, I’ll walk you through a general guide on the key steps and potential pitfalls to watch out for. We’ll cover the following areas: Planning, Data, Automation, Dry Run, Production Migration, and Verify.

Case Study: Australian Bank Transforms Digital Identity Management, Migrating 60,000 Users to the Cloud

The bank, relying on an outdated in-house authentication system for more than 60,000 mortgage brokers across its lending applications, required a secure and scalable solution. As part of the Versent team, I led the migration of their digital identities and passwords from a legacy SQL-based system to a high-performance, cloud-based PingDirectory and authentication service. This migration not only improved security by upgrading to a more robust password hashing algorithm, but it also introduced multi-factor authentication (MFA) using passkeys, enhancing protection for users. Furthermore, we enabled self-service options, empowering users to manage and reset their credentials independently. These improvements resulted in a smoother user experience, stronger security posture, and reduced administrative workload for the bank.

Planning

Yes, it can be painful, but don’t worry — we’ve got you covered! Planning your migration carefully is critical. Poor execution can disrupt your entire business, especially if your operations rely on digital systems. While you can choose a planning methodology that best fits your organization, your migration plan should include these key milestones at a high level:

  • Assembling a team with the necessary expertise
  • Analyzing the source and destination data formats
  • Automating data transformation processes
  • Running a dry run of the automation
  • Preparing rollback strategies
  • Initiating the production migration process

For each milestone, carefully identify the steps, dependencies, and potential obstacles, particularly for tasks on the critical path. To ensure a smooth and successful migration, prioritize full automation using appropriate tools and scripts wherever possible.

Data

Source and Destination

Let’s dive in. You could be handling user identity migration from a source to a destination database or directory in one of the following scenarios:

  • From or to SQL-type databases
  • From or to LDAP-type directories
  • A combination of the above.

SQL (Structured Query Language) manages relational databases, which store data in tables with rows and columns. Common examples include Microsoft SQL Server, Oracle Database, and PostgreSQL.

LDAP (Lightweight Directory Access Protocol) is used to access directory services, which store data in hierarchical structures. LDAP is often used to organize and manage users, groups, and resources within organizations. Examples include Microsoft Active Directory, OpenLDAP, and Oracle Directory.

In the real world, your migration might look slightly different: it could involve moving from an on-premise Microsoft Active Directory to Entra ID (formerly Azure Active Directory) in the cloud, or from an Oracle Directory to a PingDirectory, among other vendor and database combinations.

Modern Identity and Access Management (IAM) systems support SQL- and LDAP-based databases. However, LDAP is better suited for storing user data than SQL because it’s optimized for read-heavy operations, such as user authentication, and uses a hierarchical structure ideal for organizing identities. It supports identity-specific schemas and underpins identity protocols like SSO and OAuth, making it efficient for managing large user bases. While SQL is more flexible for transactional or relational data, LDAP excels in performance, scalability, and integration for identity management.

Data Format

CSV files: If you are dealing with SQL, the extracted data will most likely be a CSV (comma-separated values) file.

user_id, given_name, family_name, email
U0001, John, Doe, john@example.com
U0002, Jane, Smith, jane@example.com

LDIF files: If you are dealing with LDAP, the extracted data will most likely be an LDIF file (LDAP Data Interchange Format, a plain-text representation that LDAP-based directories understand natively).

dn: uid=U0001,ou=users,dc=example,dc=com
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: top
uid: U0001
cn: John Doe
sn: Doe
givenName: John
mail: john@example.com
telephoneNumber: +1 555 123 4567
userPassword: {SSHA}oT7+5Mn9tO+8LJk1nFyH3v+aQsjDk5Kn
ou: users
title: Senior Developer

API: Another option is available if your source and destination support a REST API or SCIM. Using these interfaces, you can devise a strategy to extract, transform, and upload data programmatically.

// Sample SCIM API request

POST /scim/v2/Users
Content-Type: application/json

{
  "schemas": [ "urn:ietf:params:scim:schemas:core:2.0:User" ],
  "userName": "U0001",
  "name": {
    "givenName": "John",
    "familyName": "Doe"
  },
  "emails": [
    { "value": "john@example.com", "primary": true }
  ],
  "phoneNumbers": [
    { "value": "+1 555 123 4567", "type": "work" }
  ],
  "title": "Senior Developer",
  "active": true,
  "password": "{SSHA}oT7+5Mn9tO+8LJk1nFyH3v+aQsjDk5Kn"
}

REST API (Representational State Transfer Application Programming Interface) is a set of rules that allows applications to communicate with each other over HTTP. It is widely used today to enable interaction between different systems and applications.

SCIM (System for Cross-domain Identity Management) is an industry-standard protocol for automating the exchange of user identity information across domains and applications. It simplifies CRUD (Create, Read, Update, Delete) operations on user objects, making it easier to manage identities across various systems in a standardized and scalable way.
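If you go down the API route, the upload step typically boils down to a scripted HTTP call per user. The sketch below is illustrative only: it assumes Python with the third-party requests package, a hypothetical SCIM endpoint, and a placeholder bearer token, not any particular vendor's API.

# A hedged sketch (requires the third-party "requests" package; the endpoint
# and bearer token below are placeholders) of pushing one transformed user
# to the target system over its SCIM 2.0 API.
import requests

payload = {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
    "userName": "U0001",
    "name": {"givenName": "John", "familyName": "Doe"},
    "emails": [{"value": "john@example.com", "primary": True}],
    "active": True,
}

resp = requests.post(
    "https://idp.example.com/scim/v2/Users",              # hypothetical endpoint
    json=payload,
    headers={"Authorization": "Bearer <access-token>"},    # placeholder token
    timeout=30,
)
resp.raise_for_status()
print(resp.status_code, resp.json().get("id"))   # expect 201 and the new resource id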

Hence, your data transformation script will convert data to and from one of the above formats. Other tools and strategies you might consider for data migration include:

  • Vendor-Specific Tools: If you’re migrating user identities from an on-premise Active Directory to Entra ID (formerly Azure Active Directory) in the cloud, the most common tool would be Microsoft’s Azure AD Connect, which synchronizes on-premises directories with Azure AD.
  • Just-in-Time Migration: In this approach, you run the old user directory or database in parallel with the new system. Each time a user logs in, their account is automatically created in the new system. Once you’re confident that most or all users have logged in and their accounts are successfully migrated, you can safely decommission the old directory or database.

Data Cleanup

“Garbage in, garbage out!” Data migration provides the perfect opportunity to clean up your production data. Carrying over data issues to your new platform defeats the purpose of migration. Here are some key steps to ensure your data is clean and ready:

  1. Ensure Unique Identifiers: Verify that each user entry has a unique ID and that there are no duplicate records.
  2. Standardize Data Formats: Ensure all data attributes follow the required format. For example, fields like “age” should be in integer form, email addresses should follow the standard format (e.g., abc@company.com), and all mandatory fields should be filled in.
  3. Validate Key Attributes for Authentication: If certain attributes are critical to your authentication policy (e.g., mobile numbers for SMS OTP, office locations for printer access), give these attributes extra attention to ensure accuracy.
  4. Consistency in “Active/Inactive” Flags: Ensure that all “Active/Inactive” status flags, particularly those used in authentication policies, are consistent in format and case.
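A lightweight pre-flight script can catch the first two items automatically before anything is loaded into the new system. The sketch below is illustrative only: it embeds a few sample rows matching the CSV layout shown earlier and uses nothing beyond the Python standard library.

# A pre-flight check (illustrative only; assumes the CSV layout shown earlier)
# that flags duplicate user IDs and malformed email addresses before loading.
import csv
import io
import re
from collections import Counter

sample = io.StringIO(
    "user_id,given_name,family_name,email\n"
    "U0001,John,Doe,john@example.com\n"
    "U0002,Jane,Smith,jane@example\n"        # malformed email
    "U0001,John,Doe,john@example.com\n"      # duplicate user_id
)
rows = list(csv.DictReader(sample))

EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.]{2,}$")
id_counts = Counter(row["user_id"] for row in rows)

print("Duplicate IDs:", [uid for uid, n in id_counts.items() if n > 1])
print("Invalid emails:", [row["user_id"] for row in rows if not EMAIL_RE.match(row["email"])])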

Data Transformation

User Attributes migration: A digital identity is associated with several attributes. From your schema review, you need to finalize which attributes you plan to carry forward, which ones you plan to skip, and which ones need to be transformed. All of this can then form the basis of your automation or input to the tool you plan to use.
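One practical way to pin these decisions down is to capture them as a small mapping table that the automation reads. The sketch below is hypothetical (the attribute names are invented for illustration, not taken from any particular schema) and simply shows the carry-forward, rename, and skip decisions expressed as code.

# A hypothetical attribute-mapping table driving the transformation script:
# carry some source attributes forward (renamed to the target schema), and
# skip anything not listed.
FIELD_MAP = {
    "user_id": "uid",            # carry forward, renamed to the target attribute
    "given_name": "givenName",
    "family_name": "sn",
    "email": "mail",
    # "fax_number" is deliberately absent, so it will be skipped
}

def transform(source_row):
    """Return only the mapped attributes, with normalised values."""
    entry = {target: source_row[source].strip()
             for source, target in FIELD_MAP.items() if source in source_row}
    entry["mail"] = entry["mail"].lower()   # example of a simple transformation
    return entry

print(transform({"user_id": "U0001", "given_name": "John", "family_name": "Doe",
                 "email": "John@Example.com", "fax_number": "n/a"}))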

User Group migration: You may want to carry forward groups and their user associations from the source to the target, which may be useful in applying authentication or authorization policies. If this is the case, you need to prepare additional data extracts of:

  • Group names and their attributes
// A CSV file illustration of groups and their attributes
GroupID,GroupName,Description,Role
G001,Admins,Administrators group with full access,Admin
G002,Developers,Developers with access to development environments,Developer
  • Group and user associations
// A CSV file illustration of group and user associations
UserID,GroupID
U101,G001
U102,G002
U103,G002

Password migration: If you decide to carry forward users’ passwords, there are a few things to consider. Passwords are usually stored as hashes, a one-way transformation of clear text into a hash value. During migration, ensure the destination database supports the hashing algorithm used in the source; otherwise, the passwords cannot be migrated. Some popular password hashing algorithms are bcrypt, Argon2, PBKDF2, and SHA-256.

Password clear text: 
Password01!

Password hash value using bcrypt algorithm:
$2b$12$DpwuwOHXFvMPPNV0fPl75.HiPHkLMQfzzs94mVTA2VebMcWw3C0nK

Then there is also the “salting” of hashes. Salting appends a random value to a password before hashing, so the resulting hash is different even for identical passwords. This protects against rainbow table and brute-force attacks. During migration, remember to migrate the passwords’ salt values, too.

Before Salting:

User A Password: password123 → Hash: e99a18c428cb38d5f260853678922e03
User B Password: password123 → Hash: e99a18c428cb38d5f260853678922e03

After Salting:

User A Password: password123 + Salt: abc123 → Hash: 5a67dffae1d5e8d92b
User B Password: password123 + Salt: xyz789 → Hash: abef345c678901239
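As an aside on the mechanics, modern algorithms such as bcrypt embed the salt (and cost factor) inside the hash value itself, which is why a migrated bcrypt hash keeps working on the new system without a separate salt column. Below is a minimal sketch, assuming Python with the third-party bcrypt package installed; it is not the bank's migration code, just an illustration of verifying a password against a migrated hash.

# A minimal sketch (requires the third-party "bcrypt" package) showing why a
# migrated bcrypt hash keeps working: the salt and cost factor are embedded
# in the hash string itself, so only the algorithm needs to be supported.
import bcrypt

migrated_hash = bcrypt.hashpw(b"Password01!", bcrypt.gensalt(rounds=12))
print(migrated_hash)   # e.g. b"$2b$12$..." - the salt is part of the value

# At first login on the new system, the supplied password is verified
# against the migrated hash; no clear-text password ever has to move.
print(bcrypt.checkpw(b"Password01!", migrated_hash))     # True
print(bcrypt.checkpw(b"WrongPassword", migrated_hash))   # False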

Remember that password policies can differ between the source and target when migrating hashes, as the new system has no way of inspecting a hash value to determine whether the underlying password meets the new policy. If you are moving from a weaker to a stronger password policy, you could adopt a strategy such as forcing users to reset their passwords within 30 days post-migration, which ensures all users’ passwords eventually comply with the stronger policy.

Skipping password migration: If we decide not to migrate passwords, then after the user migration, we will need to prompt the user population to set new passwords for login. The most efficient way to handle this is by implementing an effective “self-service password reset” flow in the new system. The password reset process should leverage a known user attribute that others find difficult to guess or access. For example, use a mobile number for SMS OTP, email OTP, an existing MFA method (if available), or another secure password reset mechanism.

Without such a system, your Helpdesk team will likely be overwhelmed with password reset requests in the first few weeks, which could lead to operational bottlenecks and frustration. Additionally, it’s essential to plan system resources to handle the high volume of password resets anticipated in the initial days following the release of the new system.

Indexing: A frequently overlooked aspect of setting up a target identity system is indexing, a critical factor for optimizing performance. An index is a lookup structure that significantly speeds up READ operations, leading to faster response times, especially during user login. It’s essential to carefully analyze your indexing requirements, ensuring that any relevant indexes from your old system are carried over or adjusted for the new environment. Additionally, consider creating new indexes tailored to the specific demands of your new system to boost performance further.

// App assumptions:
// User entries in the database or directory: 5,000,000
// Number of user logins per day: 20,000
// Number of user logins during peak hour: 5,000
// Number of user logins during the peak second of the day (TPS): 500

// Sample User login duration WITHOUT indexing the "username" attribute

2000 milliseconds

// Sample User login duration WITH indexing the "username" attribute

100 milliseconds
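To make the effect tangible, here is a small, self-contained sketch using Python's built-in sqlite3 module. It is purely illustrative; your production directory or database will have its own index configuration and tooling, but the before-and-after difference on the login lookup is the same idea.

# A self-contained illustration using Python's built-in sqlite3 (your
# production directory or database will have its own index configuration):
# time the login lookup before and after indexing the username attribute.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (user_id TEXT, username TEXT, email TEXT)")
cur.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    ((f"U{i:07d}", f"user{i}", f"user{i}@example.com") for i in range(500_000)),
)
conn.commit()

def timed_lookup(username):
    start = time.perf_counter()
    cur.execute("SELECT user_id FROM users WHERE username = ?", (username,))
    cur.fetchone()
    return (time.perf_counter() - start) * 1000

print(f"Without index: {timed_lookup('user499999'):.2f} ms")

cur.execute("CREATE INDEX idx_users_username ON users (username)")
print(f"With index:    {timed_lookup('user499999'):.2f} ms")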

A prolonged login process results in a poor user experience (UX) and can drive users to switch to a competitor’s app.

Automation

Automate! Automate! Automate!

The best way to carry out your migration is with automation. The automation script needs to be developed and tested across several dry runs. Let’s look at a few automation considerations.

Tools and people selection

It’s crucial to plan for the tools or scripts you’ll need and ensure you have the right people with the necessary skills to implement them. Typically, you’ll be working with common scripting languages like Python, Shell, or PowerShell, depending on the data type you’re handling. Alternatively, your target database or solution may offer built-in tools to import data from legacy systems. However, it is recommended to use customized automation as it allows for flexibility in meeting your business’s specific data and process requirements.

Server selection

Identify the server on which you are going to run the data transformation script, and stick to the same server type (such as a specific Linux distribution or Windows version) across non-prod and prod. This server does not need to be the server that hosts the user’s database; it can be a separate server (such as a jump host) with a route to the destination and source databases. This host may handle PII (Personally Identifiable Information), so make sure its security policies are configured to meet your audit obligations.

AID properties

The automation script should be developed with AID (atomicity, isolation, and durability) properties in mind. Atomicity: each user entry should be written in full or not at all to the target database or directory. Isolation: the writing of one user entry should not interfere with the writing of another. Durability: once an entry is committed, it should remain permanent even in the event of a system failure.
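As a rough illustration of these properties in practice (not the actual migration script), the sketch below writes each user in its own transaction using Python's built-in sqlite3 as a stand-in for the target: an entry is committed in full or not at all, and one entry's failure does not affect the next.

# An illustrative sketch (Python's built-in sqlite3 as a stand-in for the
# target database or directory): each user entry is written in its own
# transaction, so it is committed in full or not at all, and a failure on
# one entry does not affect the others.
import sqlite3

conn = sqlite3.connect(":memory:")   # stand-in for the real target
conn.execute("CREATE TABLE users (uid TEXT PRIMARY KEY, mail TEXT, sn TEXT)")
conn.commit()

def write_user(entry):
    try:
        with conn:   # one transaction per entry: commit on success, roll back on error
            conn.execute("INSERT INTO users (uid, mail, sn) VALUES (:uid, :mail, :sn)", entry)
        return True
    except sqlite3.Error:
        return False   # flag the entry as failed and move on

for entry in [{"uid": "U0001", "mail": "john@example.com", "sn": "Doe"},
              {"uid": "U0001", "mail": "dup@example.com", "sn": "Dup"}]:   # duplicate uid fails
    print(entry["uid"], "Success" if write_user(entry) else "Fail")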

Data format

The script/tool should be able to validate each data entry before attempting to transform it, so that a failed update is caught in advance. This is in addition to the data cleanup activity.

# An illustration of checking the format of attribute values before
# attempting to upload them to the new database or directory

import re

# Example entry to validate
data = {
    "email": "john@example.com",
    "mobile": "+15551234567",
    "firstName": "John",
    "address": "1 Example Street",
}

# Validate email format using regex
email_regex = r'^[\w\.-]+@[\w\.-]+\.\w{2,}$'
email_valid = re.match(email_regex, data["email"]) is not None

# Validate mobile number (10-15 digits, optional '+')
mobile_regex = r'^\+?\d{10,15}$'
mobile_valid = re.match(mobile_regex, data["mobile"]) is not None

# Validate firstName (non-empty and only alphabets)
first_name_valid = data["firstName"].isalpha() and len(data["firstName"]) > 0

# Validate address (non-empty)
address_valid = len(data["address"].strip()) > 0

Data Transfer

Define how to transfer the script, data, and log files to and from the jump host. Ensure routes for transfer are established by opening ports and adding any relevant firewall rules, along with any required dependencies. If transferring sensitive or PII data, ensure encryption protocols (e.g., TLS for data in transit, encryption at rest) are in place to maintain security.

Fault Tolerance

Ensure the script or tool is fault-tolerant so it doesn’t halt or error out on incorrect entries. Instead, it should flag the entry as a failure and continue processing the next one.
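A hedged sketch of what that looks like in a processing loop follows; the upload_user helper is hypothetical. A malformed entry is caught, recorded as a failure, and the run moves on, producing the per-user status log described in the next section.

# An illustration of a fault-tolerant processing loop: a bad entry is logged
# as a failure and the run continues (upload_user is a hypothetical helper).
users = [
    {"uid": "U0001", "mobile": "+15551234567"},
    {"uid": "U0002", "mobile": "not-a-number"},
]

def upload_user(user):
    """Hypothetical upload that rejects a malformed mobile number."""
    if not user["mobile"].lstrip("+").isdigit():
        raise ValueError("mobile format is incorrect")

with open("your_script_output.log", "w") as log:
    for user in users:
        try:
            upload_user(user)
            log.write(f"{user['uid']} Success\n")
        except Exception as exc:            # don't halt the whole run
            log.write(f"{user['uid']} Fail, {exc}\n")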

Logs

Configure the script/tool to generate output logs where each line records the transformation status (success/failure) of a user identity, along with the user’s unique ID. This will make tracking progress straightforward and allow for easy manual correction of failed entries. Ensure that failure details are logged clearly, enabling quick identification and resolution of issues during manual review. Additionally, configure the script to log execution timings so you can track the overall runtime and performance of the process.

<Sample script output log>

U0001 Success
U0002 Fail, mobile format is incorrect
U0003 Success
U0004 Success
U0005 Fail, mandatory email is missing
U0006 Success

<script showing the execution time at the end>

Script start time: 10:00:00
Script end time: 17:00:00
Run duration: 7 hours

Dry Run

This process tests and fine-tunes the data transformation before production. Start with test data to adjust for failures and exceptions. If possible, run a dry run with masked production data to catch potential issues early, ensuring it’s done on a secure server to protect PII.

For large datasets (say, millions of user entries), the dry run process can take several hours. Plan for contingencies by monitoring CPU load on the jump host; if utilization is low, add more threads. When multithreading, split the data (e.g., for 300,000 users, run three threads over the ranges 1–100,000, 100,001–200,000, and 200,001–300,000); a multithreading sketch follows the monitoring example below. Always run the script as a background process to avoid disruptions if you lose your SSH or RDP session.

<Running a process in the background in UNIX, with the & symbol>

# ./your_script.sh &

<Monitoring server load in UNIX; you would want to get the idle time close
to 10%, which means the system is being utilized at 90%>

# top
Processes: 731 total, 2 running, 729 sleeping, 3581 threads          16:04:40
Load Avg: 2.83, 3.24, 3.19
CPU usage: 11.16% user, 6.35% sys, 82.47% idle
SharedLibs: 483M resident, 90M data, 43M linkedit.
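Below is an illustrative Python sketch of the range-splitting approach; the upload step is a hypothetical stand-in, and the worker count can be tuned against the idle CPU you observe in top.

# An illustrative sketch (Python; the upload step is a hypothetical stand-in)
# of splitting the user set into ranges and running each range on its own
# worker thread.
from concurrent.futures import ThreadPoolExecutor

USERS = [{"uid": f"U{i:06d}"} for i in range(300_000)]   # stand-in for the extracted data

def upload_user(user):
    """Hypothetical stand-in for the transform-and-upload of one user."""
    return bool(user["uid"])

def migrate_range(start, end):
    ok = failed = 0
    for user in USERS[start:end]:
        if upload_user(user):
            ok += 1
        else:
            failed += 1
    return ok, failed

ranges = [(0, 100_000), (100_000, 200_000), (200_000, 300_000)]
with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
    results = list(pool.map(lambda r: migrate_range(*r), ranges))
print(results)   # e.g. [(100000, 0), (100000, 0), (100000, 0)]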

Production Migration

Plan your production migration well in advance, ideally a month or two before the change. Use insights from your dry run to determine the steps and time required for a successful production data migration. Key considerations for production migration include:

Dev and Ops coordination

The transformation script will most likely be developed by a development team and executed by the operations team, which has access to the production data and environment. Therefore, coordination between the development and operations teams is essential for knowledge transfer and combined dry-run execution.

Halt in user CUD

During the production migration change window, you need to let all affected applications and administrators know that user CUD (create, update, and delete) operations must be halted until the migration completes and is verified, so that stray changes don’t throw a spanner into the number tally.

Roll back and start over

No matter how meticulously we plan, something can go wrong. So, always build a rollback capability into the data transformation script and be prepared to start over.
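One simple rollback pattern, sketched below under the assumption that the script produces the per-user status log shown earlier, is to replay that log and remove every entry that was written to the target; the delete_user helper is hypothetical.

# A sketch of a rollback pass: replay the script's output log and delete
# every entry that was successfully written (delete_user is hypothetical).
def delete_user(uid):
    """Hypothetical stand-in for deleting one entry from the target."""
    print(f"deleting {uid}")
    return True

with open("your_script_output.log") as log:
    for line in log:
        if not line.strip():
            continue
        uid, status = line.split(maxsplit=1)
        if status.startswith("Success"):
            delete_user(uid)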

Timing

The timing for the data migration exercise will be informed by the dry run exercise performed in a test environment. Some of the things you need to consider:

  1. Did the test dry run use the same volume of user identities as production? If so, the timing estimate will be realistic.
  2. Add additional time for development and operations coordination in prod, which may not have been needed in non-prod.
  3. Now, multiply that time by at least 2.5 to allow for repeat runs and manual fixes (see the worked example below).
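For example, if the dry run on production-sized data took 7 hours (as in the sample script timing above) and you allow another hour for prod coordination, plan a change window of at least (7 + 1) × 2.5 = 20 hours.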

Verify

Numbers

This activity is very much part of the production migration and should be performed within the change window. This is where we determine whether the migration was a Success or a Failure and decide whether we need to roll back.

Validating the numbers is what is going to tell us if the data migration activity is a success. Here are some of the numbers you need to validate and match for a successful migration:

  • number of source user identities
-- If the source user identities are on SQL

SELECT COUNT(*)
FROM USER
WHERE status = 'ACTIVE';

-- Output

100
  • number of successful user identities from the automation script
# Counts the number of users that have been successfully transformed 
# and uploaded to the target directory by the automation script.
# (Assumes the script is being run in a UNIX environment using bash)

$ cat your_script_output.log | grep "Success" | wc -l

# Output

100
  • number of user identities from the target database or directory
# Assuming your target is an LDAP directory
# A sample query to count user entries would be

$ ldapsearch -x -LLL -b "dc=example,dc=com" "(objectClass=person)" uid | grep "uid\:" | wc -l

# Output

100

When your numbers don’t match, go back and identify which users need to be manually fixed in the target database, so have a manual update strategy planned.

Password Validation

If password hashes have been migrated, a successful login to an application using a password from the old system serves as validation that the user passwords have been migrated correctly.

Attributes and Group Validation

One effective way to validate user migration is to successfully log into the application after the migration and conduct a Business Verification Test (BVT). This involves accessing and performing key tasks within the app to ensure that authentication and authorisation function as expected.

A more robust approach to verifying a successful migration is by examining the interaction between the Identity Provider (IdP) and the relying party application. For popular integration patterns like SAML or OAuth, you can compare a SAML assertion and a JWT (JSON Web Token) ID token before and after the migration. This comparison will confirm whether user attributes and groups are being sent as they were before the migration.

//Sample SAML Assertion

<saml:Subject>
  <saml:NameID Format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress">
    user@example.com
  </saml:NameID>
</saml:Subject>

...

<saml:AttributeStatement>
  <saml:Attribute Name="given_name">
    <saml:AttributeValue>John</saml:AttributeValue>
  </saml:Attribute>
  <saml:Attribute Name="family_name">
    <saml:AttributeValue>Doe</saml:AttributeValue>
  </saml:Attribute>
  <saml:Attribute Name="role">
    <saml:AttributeValue>admin</saml:AttributeValue>
  </saml:Attribute>
</saml:AttributeStatement>
</saml:Assertion>

//Sample JWT ID Token

{
  "iss": "https://authserver.example.com",
  "sub": "1234567890",
  "aud": "client_id_123",
  "exp": 1712345678,
  "iat": 1612345678,
  "email": "user@example.com",
  "email_verified": true,
  "name": "John Doe",
  "picture": "https://example.com/profile.jpg",
  "given_name": "John",
  "family_name": "Doe",
  "locale": "en",
  "role": ["admin", "editor", "viewer"]
}

Remember that the certificates used to sign or encrypt the tokens may have changed or been updated, so you can disregard those differences. If the tokens are encrypted, you will need to have them decrypted by someone who holds the private keys.

Hypercare

Even after a migration is declared successful, exceptions can cause data issues. Therefore, it’s best practice to establish a 30-day (or any appropriate duration) hypercare period. During this time, the development team should remain on standby, ready to engage with the operations team on short notice—bypassing the usual ticketing system—to address any incidents related to the migration.

Wrap

Each identity migration is unique, with its own set of challenges!

No two migrations are the same because each operates under different variables—often literal code variables. However, what remains constant is the structured approach taken: Planning, Data, Automation, Dry Run, Production Migration, and Verify.

For an effective migration, remember that the tuning done during the dry run informs the steps to take during the production migration. Additionally, always have contingency plans for when things don’t go as expected.

Key considerations we haven’t touched upon

  • Stakeholder Involvement: Ensure that key stakeholders, including development, security, compliance, and business teams, are engaged at each critical milestone. Their input is essential, and they should be kept updated on progress.
  • Business Objectives: While we haven’t focused on the “Why” of the migration, it’s crucial to define clear objectives and measure if they’ve been achieved post-migration. Examples include improving security posture, enhancing performance, or meeting scalability goals.
  • Compliance: Consult your compliance team to ensure that the migration process aligns with relevant regulatory requirements, such as HIPAA, GDPR, or other standards that apply to your organization.
  • Communication (Comms): Proper communication is critical. App admins should avoid making configuration changes during migration. Identity admins should temporarily suspend user provisioning and de-provisioning operations. End users must be informed, especially if password resets or other actions will be required post-migration.
  • Rollback and Backups: Always have a well-defined rollback and backup strategy in case the migration doesn’t go as planned. Be prepared for a second attempt if necessary, and ensure data integrity through comprehensive backups.

Good luck with burning the midnight oil 🙂
