Md. Mehadi Hasan

Senior Software Engineer

Professional Summary

  • ~5 years of experience in data and backend development, building scalable data pipelines and systems.
  • Proficient in Python, SQL, and Elasticsearch (~5 yrs); AWS (~5 yrs); Scrapy and Kafka (~3 yrs); ML (~2 yrs).
  • Proven track record in optimizing large-scale ETL workflows and reducing infra costs.
  • Hands-on experience in delivering AI-driven data solutions since 2023.

    WORK EXPERIENCE

    03/25 - Pres.
    • Architected a core company data and metadata indexing pipeline (financial reports, geographical locations, corporate structure, industry categories, URLs, situation, key persons, etc.) processing 2M+ company records daily, covering 60% of the system's 222M+ total records; improved data retrieval speed by 28.6% and reduced query response time from 800ms to 570ms.
    • Implemented an automated designation categorization system using the GPT-4 API, reducing manual categorization effort by 40% and processing 10K+ designations weekly at 94% accuracy.
    • Reduced AWS infrastructure costs by 43% through Elasticsearch query optimization and improved S3 access patterns, cutting sync operation time by 86%.
    • Built an LLM-based customer profile enrichment system that adds 15+ attributes to customer profiles, contributing to a 12% increase in lead conversion rates.
    • Developed an NER-based company extraction and news tag detection tool, improving data extraction accuracy by 25% and reducing Elasticsearch load by 38%.
    • Developed a company matching service on AWS Elasticsearch for large user file imports, enabling users to run CRM sync operations.
    • Contributed to an event bus system built on the Pub/Sub model to process user request events.
    • Developed a recruitment data pipeline to extract job postings from recruitment sites such as Indeed.com.
    • Built a keyword extraction tool and exposed it as an API for retrieving each company's activity texts.
    11/20 - 10/22
    • Developed a collaborative filtering tool to find similarities across specific data dimensions within users' sets of company data.
    • Developed a Grafana dashboard and alert system for the logs and metadata of several data pipelines, integrated with a Slack channel to notify the team of internal incidents.
    • Performed data modeling and schema design for MySQL, Athena with Iceberg tables, and Elasticsearch.

    SKILLS

    • Python
    • SQL
    • AWS ES
    • Scrapy
    • Playwright
    • PySpark
    • Kafka
    • Airflow
    • Pandas
    • Iceberg
    • Crawl4ai
    • Pydantic
    • Pytest
    • FastAPI
    • LLM
    • Langchain
    • Git
    • UV
    • Celery
    • LangGraph

    CERTIFICATES

    PROJECTS

    Scrapy, Celery, Playwright, Redis, Docker
    • A hotel search web app that fetches hotel information from booking.com and agoda.com based on user input and shows the best matches.
    UV, Pydantic, Factory design pattern, SOLID principles
    • An ETL pipeline that fetches data from AWS Athena and stores it in Elasticsearch/OpenSearch and DynamoDB, managed with the UV package manager.
    Pandas, Scikit-Learn, Ensemble, OneVsRestClassifier
    • Vulnerability prediction for groundwater resources using the DRASTIC model, a widely used groundwater vulnerability assessment method.

    EDUCATION

    BSc in Computer Science and Engineering | 2015 - 2019
    Shahjalal University of Science & Technology
    CGPA: 3.81 [rank: 4th]

    ACHIEVEMENTS

    • ACM ICPC Dhaka Regional 2018: competed with a three-member team; ranked 63rd among 298 teams.
    • Hackathon: Champion of the hackathon at the IUT 9th ICT Fest 2017, Dept. of CSE, IUT.
    • SQL (HackerRank): solved all problems and earned five stars.
    • Solved 1130+ problems across various online judges, including 215+ on LeetCode (136+ Medium), 135+ on Codeforces, and 100+ on LightOJ.
