You are currently viewing Building an ETL Process with PHP: From CSV to MySQL
  • Post category:PHP
  • Post comments:0 Comments
  • Reading time:78 mins read

ETL stands for Extract, Transform, Load — it’s a classic data pipeline approach used in everything from analytics to app integrations.

We’re going to:

  • Read a CSV with 10 fields
  • Transform/clean several of them (date formats, name splits, status mapping, etc.)
  • Insert into a normalized MySQL table using PHP and PDO

Sample Extended CSV File (users_extended.csv)

Plaintext
id,full_name,email,birthdate,signup_date,is_active,country_code,phone_number,role,last_login
1,John Doe,<a href="mailto:john@example.com" rel="noreferrer noopener" target="_blank">john@example.com</a>,1990-03-15,2024-01-01,1,US,+12025550123,admin,2024-06-01 10:23:45
2,Jane Smith,<a href="mailto:jane@example.com" rel="noreferrer noopener" target="_blank">jane@example.com</a>,1985-07-23,2023-12-15,0,GB,+447911123456,editor,2024-06-20 09:15:32
3,Bob Lee,<a href="mailto:bob@example.com" rel="noreferrer noopener" target="_blank">bob@example.com</a>,1999-11-08,2024-03-10,1,IN,+919876543210,viewer,2024-07-01 14:02:00

MySQL Table Schema

SQL
CREATE TABLE users (
    id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100),
    age INT,
    signup_date DATE,
    is_active BOOLEAN,
    country VARCHAR(2),
    phone_number VARCHAR(20),
    role ENUM('admin','editor','viewer'),
    last_login DATETIME
);

Transformations to Perform

FieldAction
full_nameSplit into first_name and last_name
birthdateCalculate age
is_activeConvert 1/0 to boolean
country_codeKeep as-is (ISO 2-letter)
phone_numberFormat to international standard (strip non-digits)
roleNormalize value (e.g., lowercase only)
last_loginConvert to MySQL DATETIME format

PHP ETL Script

PHP
<?php
$pdo = new PDO("mysql:host=localhost;dbname=etl_demo", "root", "");

// Open CSV
if (($handle = fopen("users_extended.csv", "r")) !== FALSE) {
    $header = fgetcsv($handle); // Read headers

    while (($row = fgetcsv($handle)) !== FALSE) {
        $data = array_combine($header, $row);

        // Split name
        [$firstName, $lastName] = explode(" ", $data['full_name'], 2);

        // Calculate age
        $birthDate = new DateTime($data['birthdate']);
        $today = new DateTime();
        $age = $today->diff($birthDate)->y;

        // Normalize values
        $isActive = $data['is_active'] == "1" ? 1 : 0;
        $country = strtoupper($data['country_code']);
        $role = strtolower(trim($data['role']));
        $phone = preg_replace('/\D+/', '', $data['phone_number']);
        $lastLogin = date("Y-m-d H:i:s", strtotime($data['last_login']));

        // Insert
        $stmt = $pdo->prepare("
            INSERT INTO users (id, first_name, last_name, email, age, signup_date, is_active, country, phone_number, role, last_login)
            VALUES (:id, :first_name, :last_name, :email, :age, :signup_date, :is_active, :country, :phone_number, :role, :last_login)
        ");

        $stmt->execute([
            ':id' => $data['id'],
            ':first_name' => $firstName,
            ':last_name' => $lastName ?? '',
            ':email' => $data['email'],
            ':age' => $age,
            ':signup_date' => $data['signup_date'],
            ':is_active' => $isActive,
            ':country' => $country,
            ':phone_number' => $phone,
            ':role' => $role,
            ':last_login' => $lastLogin
        ]);
    }

    fclose($handle);
    echo "Extended ETL process completed.";
}
?>

Result Example

After running, your MySQL users table will look like:

idfirst_namelast_nameemailageis_activecountryphone_numberrolelast_login
1JohnDoejohn@example.com341US12025550123admin2024-06-01 10:23:45
2JaneSmithjane@example.com390GB447911123456editor2024-06-20 09:15:32
3BobLeebob@example.com251IN919876543210viewer2024-07-01 14:02:00


🎉 Wrap Up

You now have a powerful ETL flow using PHP and MySQL that can:

  • Work with real-world data formats
  • Perform multiple field transformations
  • Be expanded for automation or API integrations

Please follow and like us:

Leave a Reply