varunmuriyanat.github.io

Last Modified: 2020-10-25
Update: I accepted an offer for a contract position from Farm Credit Canada.

About me

I have spent the better part of the last decade building data solutions.

My work has spanned the whole data spectrum, from analysing data in spreadsheets at the low end to building highly scalable data platforms that feed the machine learning models driving business decisions.

I have done well for myself and for the organizations that I have worked for.

Over these years, I have picked up quite a lot of skills as part of my job and out of plain curiosity.

Many of them came in really handy for making short work of seemingly complex tasks.

I am a very hands-on person.

I continue to learn and I am passionate about my education and gaining expertise.

I have worn a lot of hats over the years. Starting with building web solutions, I have:

- Designed and built database backends and data warehouses
- Implemented ETL solutions using both off-the-shelf tools and programming libraries
- Re-did most of the above tasks on cloud platforms
- Built and deployed web services
- Designed and developed dashboards

I am comfortable with containerized solutions and orchestrating them.

I like to think of myself as a problem solver at the core and someone who likes to roll out solutions fast and improve them iteratively.

In my most recent role at CIBC, I did just that. I did a lot of data plumbing to get the data in the right place, shape and form to be reported.

I designed and built their data visualization dashboards for consumption by senior management.

Now because it’s a bank, I hope you understand that there are challenges in adopting tools and technology, and a lot of the time getting work done means having to stitch together solutions that may involve legacy technology stacks and ad hoc scripts. CIBC was no different.

I did face a lot of challenges. But I managed to overcome them and build solutions to transform the reporting in the department where I was contracted.

The technology stack was a mix of Python, SQL, NodeJS, Tableau, .NET Core 3.1 and some plain old VBA.

Prior to CIBC, I was part of the Advanced Analytics team at IHS Markit. At IHS, I built data engineering pipelines involving a stack of tools like AWS S3, Redshift, Athena, Airflow, Python, Pandas, Apache Spark and a ton of scripts.

I also developed analytical dashboards showcasing the results from the machine learning models.

I am currently taking some time off to learn some new skills and sharpen my axe.

Now before you judge me for the VBA part, I must say this. A problem had to be solved and it had to be done fast. And I got it done. And that’s what I am all about: solving problems and getting things done.

I spent a lot of time migrating ETL jobs from a cron job driven model to a workflow orchestrator.

We chose Apache Airflow because of all the good things we had read about it and because it solved most of the pain points we were experiencing, such as automatically restarting failed tasks and controlling concurrency and dependencies. Dependencies mattered most: in complex workflows we couldn’t predict when an upstream task would complete, and it got even more complicated when we were waiting for input from a third-party module.
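To make the retry and dependency points concrete, here is a minimal sketch in plain Python. It is not Airflow's API and not the code we ran; the function `run_workflow`, the task names and the simulated failure are all hypothetical, chosen only to show a task waiting on its upstream task and being restarted after a transient failure.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, dependencies, max_retries=2):
    """Run callables in dependency order, retrying each failed task.

    tasks: dict mapping task name -> zero-argument callable
    dependencies: dict mapping task name -> set of upstream task names
    Returns the task names in the order they completed.
    """
    completed = []
    # static_order() yields each task only after all of its upstream tasks.
    for name in TopologicalSorter(dependencies).static_order():
        for attempt in range(max_retries + 1):
            try:
                tasks[name]()
                break  # task succeeded, stop retrying
            except Exception:
                if attempt == max_retries:
                    raise  # out of retries: fail the whole run
        completed.append(name)
    return completed


# Hypothetical tasks: "load" fails once before succeeding.
calls = {"load": 0}


def extract():
    pass


def load():
    calls["load"] += 1
    if calls["load"] == 1:
        raise RuntimeError("simulated transient failure")


def report():
    pass


order = run_workflow(
    tasks={"extract": extract, "load": load, "report": report},
    dependencies={"extract": set(), "load": {"extract"}, "report": {"load"}},
)
```

With cron, each of these steps would be scheduled at a guessed time and a failure would go unnoticed until someone checked. In Airflow the same ideas are declared rather than hand-written: tasks take a `retries` setting and dependencies are declared between tasks, and the scheduler handles the ordering.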

I have gone through all the typical struggles one would face when moving cron tasks on EC2 to a Spark cluster with Airflow orchestrating the PySpark jobs.

I’ve had a fair share of experience migrating data analysis queries from Hive running on Hadoop to Spark for faster data processing.

Typical configuration: Airflow configured on an EC2 instance, which orchestrates all the calls:

- Pulling data from various sources into S3
- Loading data from S3 into Redshift
- Submitting Spark jobs to process data from HDFS and save it as Parquet on S3


Note to recruiters

If your client is looking for someone who has designed and implemented data warehouses, built, maintained and troubleshot data engineering pipelines, and developed and deployed dashboards using off-the-shelf tools like Tableau, TIBCO Spotfire and Power BI or custom-built solutions in Leaflet.js, D3, Plotly and NodeJS, I am the man for that job.

Aside from the successes that I’ve had, I have personally faced most of the pain points, gotchas and failures in data development, engineering and visualization so that my clients wouldn’t have to. I guess that’s the most important selling point that I bring to the table.

Here’s my updated resume. There’s only so much of what I have experienced and built over more than a decade that I can fit in 2 pages. I will be happy to get on a call or meeting to illustrate how I can add value and solve the business problem at hand.


You can always reach me at varunmuriyanat@gmail.com or (437) 237-7230.