I'm currently based in Austin, Texas, but over the past 6 years, I’ve lived and worked in the United States, Canada, Switzerland, and Germany. My weekends outside of work and school are often spent hiking, skiing, or traveling, and over the years I’ve been through 47 countries across Europe and North America.
I hold a B.Sc. in Computer Science and a B.Sc. in Electrical Engineering, along with a minor in Computer Engineering. My academic focus was primarily on machine learning/neural networks, blockchain, the Internet of Things (IoT), and security.
My main technical interests are in big data, machine learning, data science, and artificial intelligence. I’ve also become increasingly interested in system and data security for connected devices, which play an ever larger role in our daily lives.
Starting a new role and looking forward to the challenges this new opportunity brings!
I develop terabyte-scale data pipelines and highly complex SQL queries for financial forecasting of budgets in excess of $100 million. My primary focus is to help operations teams quickly find and create datasets that allow them to make critical budgeting decisions. I additionally focus on helping the business teams improve and standardize their datasets so that data scientists can build additional insights on top of reliable data, something that is critical when the company has exabytes of data.
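As a rough illustration of the kind of budget roll-up these pipelines run, here is a minimal PySpark sketch; the table, columns, and figures are invented for the example rather than taken from any production schema.

```python
# Minimal sketch of a budget roll-up query; names and numbers are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("budget-forecast-sketch").getOrCreate()

# Stand-in for a terabyte-scale spend table.
spend = spark.createDataFrame(
    [("ops", "2024-01", 1_200_000.0), ("ops", "2024-02", 1_350_000.0),
     ("infra", "2024-01", 4_800_000.0), ("infra", "2024-02", 5_100_000.0)],
    ["cost_center", "month", "actual_spend"],
)
spend.createOrReplaceTempView("monthly_spend")

# Aggregate actuals per cost center so forecasting jobs can compare them
# against planned budgets.
rollup = spark.sql("""
    SELECT cost_center,
           SUM(actual_spend) AS total_spend,
           AVG(actual_spend) AS avg_monthly_spend
    FROM monthly_spend
    GROUP BY cost_center
    ORDER BY total_spend DESC
""")
rollup.show()
```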
As the Lead Analytics Engineer, I managed a team of analysts and data engineers on two separate projects to streamline the code base, update and upgrade the underlying packages and software, and implement a software testing strategy for the unit, component, and data layers of the project. To streamline the code base and improve testing, I converted many of the scripts to use object-oriented paradigms and added Python’s type hinting to every object and method. By adding code linting and static code analysis (SCA) tools to the review process, I was able to reduce the number of linting and SCA errors from the thousands down to zero. I additionally manually reviewed all uses of third-party libraries and software to eliminate many redundant packages, then implemented code changes that allowed all external software to be updated to the latest versions. These processes allowed me to remove hundreds of files and thousands of lines of code while maintaining identical functionality, further improving the analytics team’s ability to quickly support new business requirements.
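A small, hypothetical example of the refactoring pattern described above: a script-style routine rewritten as a fully type-hinted class that linters and static analysis tools (such as mypy or pylint) can check end to end. The class and column names are illustrative, not taken from the actual code base.

```python
# Hypothetical refactor: a loose script function turned into a small,
# fully type-hinted class that static analysis can verify (Python 3.9+).
from dataclasses import dataclass


@dataclass
class DatasetValidator:
    """Checks that required columns are present in a dataset."""

    required_columns: tuple[str, ...]

    def missing_columns(self, columns: list[str]) -> list[str]:
        """Return required columns that are absent from `columns`."""
        present = set(columns)
        return [c for c in self.required_columns if c not in present]

    def is_valid(self, columns: list[str]) -> bool:
        """True when every required column is present."""
        return not self.missing_columns(columns)


validator = DatasetValidator(required_columns=("date", "spend", "region"))
print(validator.is_valid(["date", "spend"]))  # False: "region" is missing
```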
My primary work was developing an Azure-native, cloud-based data architecture for ingesting, validating, mastering, and publishing data from numerous external data sources. Our solution took a dynamic, config-driven approach in which a configuration file described both the input and output formats of the data. The GraphQL API I developed was generated from this configuration file, allowing new entities to be easily added to the system and have the API automatically create endpoints for them. Our internal ingestion, validation, and mastering jobs used PySpark on Databricks.
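To illustrate the config-driven idea, here is a stripped-down Python sketch that generates GraphQL type definitions from an entity configuration. The entities and fields are invented, and the real system involved considerably more tooling than this.

```python
# Entity definitions live in configuration (here an inline dict; in practice
# a config file), and GraphQL SDL types are generated from it. All names
# below are hypothetical examples.
config = {
    "entities": {
        "Supplier": {"id": "ID!", "name": "String!", "country": "String"},
        "Invoice": {"id": "ID!", "amount": "Float!", "supplierId": "ID!"},
    }
}


def generate_sdl(cfg: dict) -> str:
    """Emit GraphQL SDL type definitions for each configured entity."""
    blocks = []
    for entity, fields in cfg["entities"].items():
        lines = [f"  {name}: {gql_type}" for name, gql_type in fields.items()]
        blocks.append(f"type {entity} {{\n" + "\n".join(lines) + "\n}")
    return "\n\n".join(blocks)


print(generate_sdl(config))
```

Adding a new entity then only requires a new block in the configuration; no hand-written schema changes are needed.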
At Talroo, my work was primarily focused in two areas: creating a training/validation/testing dataset for an industry classifier and exploring internal geographic datasets for targeted advertising opportunities. For the job industry classifier, I created metrics and datasets for internal use to quantitatively assess the performance changes of the classification model. Since the model was run daily on tens of millions of job titles and descriptions, even tiny changes could have huge side effects on the model’s performance in different job categories. The metrics and datasets I created were used to establish an accuracy score across hundreds of categories so that side effects could be easily checked before any changes were pushed to production. My other main area of work focused on exploring terabyte-scale datasets to find geographic correlations between job seekers and job openings to help the company better target jobs to job seekers.
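A toy pandas sketch of the per-category accuracy check described above; the column names and categories are illustrative, and the real datasets covered tens of millions of rows and hundreds of categories.

```python
# Per-category accuracy on a tiny illustrative sample.
import pandas as pd

labeled = pd.DataFrame({
    "category":   ["healthcare", "healthcare", "retail", "retail", "trucking"],
    "true_label": ["nurse", "nurse", "cashier", "stocker", "driver"],
    "predicted":  ["nurse", "aide",  "cashier", "stocker", "driver"],
})

labeled["correct"] = labeled["true_label"] == labeled["predicted"]

# Accuracy per category, so a model change can be compared against the
# previous release before it is pushed to production.
per_category = (
    labeled.groupby("category")["correct"]
    .mean()
    .rename("accuracy")
    .sort_values()
)
print(per_category)
```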
I worked in a research lab that uses machine learning, computer vision, and image recognition techniques to classify and segment lesions in MRI (magnetic resonance imaging) scans for use in multiple sclerosis research. I used image segmentation architectures like U-Net and Mask R-CNN with TensorFlow and Keras, along with a variety of Python libraries like NumPy and SciPy for data processing.
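For a sense of the architecture, here is a minimal U-Net-style model in Keras with a single down/up level to show the skip-connection pattern; the real models were much deeper and trained on MRI data rather than this toy input shape.

```python
# Minimal U-Net-style sketch (one encoder/decoder level) in Keras.
import tensorflow as tf
from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(128, 128, 1))  # single-channel image slice

# Encoder: convolve, then downsample.
c1 = layers.Conv2D(16, 3, activation="relu", padding="same")(inputs)
p1 = layers.MaxPooling2D()(c1)

# Bottleneck.
b = layers.Conv2D(32, 3, activation="relu", padding="same")(p1)

# Decoder: upsample and concatenate the skip connection from the encoder.
u1 = layers.Conv2DTranspose(16, 2, strides=2, padding="same")(b)
u1 = layers.concatenate([u1, c1])
c2 = layers.Conv2D(16, 3, activation="relu", padding="same")(u1)

# Per-pixel lesion probability.
outputs = layers.Conv2D(1, 1, activation="sigmoid")(c2)

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```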
I led a team of six engineers to create the avionics systems for a rocket entered in the Spaceport America Cup. My team designed the software and communications systems for a 9 km, Mach 1.5 rocket launch, which required integrating sensors and control systems with all other project subgroups. Throughout the design and build process, we created tests to validate that all sub-systems worked in both normal and worst-case scenarios.
I worked with a team to create an open-source framework for fog and edge computing systems. Our software manages data replication and processing across different edge nodes and numerous logical groupings. My work primarily focused on designing a strictly consistent distributed namespace that provides a single source of truth containing information on all system entities.
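A deliberately simplified, single-process sketch of what the namespace tracks: a mapping from hierarchical names to entity metadata. The actual framework replicates this state across nodes with strict consistency, which this toy in-memory version does not attempt to show, and the entity names below are invented.

```python
# Toy, single-process stand-in for the distributed namespace (Python 3.9+).
from dataclasses import dataclass, field


@dataclass
class Namespace:
    """Maps hierarchical names like 'group/node' to entity metadata."""

    entries: dict[str, dict] = field(default_factory=dict)

    def register(self, name: str, metadata: dict) -> None:
        if name in self.entries:
            raise ValueError(f"{name!r} already registered")
        self.entries[name] = metadata

    def lookup(self, name: str) -> dict:
        return self.entries[name]

    def list_group(self, prefix: str) -> list[str]:
        """All entities under a logical grouping, e.g. 'plant-a/'."""
        return [n for n in self.entries if n.startswith(prefix)]


ns = Namespace()
ns.register("plant-a/edge-node-1", {"role": "ingest", "region": "eu"})
ns.register("plant-a/edge-node-2", {"role": "replica", "region": "eu"})
print(ns.list_group("plant-a/"))
```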
My work focused on writing Python 3 scripts for automating data collection for use in neural network models. Our models forecast solar panel power generation in different regions across the world to help utilities accurately predict their power generation requirements. The data collected from the web is provided in a number of different formats (CSV, XLSX, HTML, etc.) and must be converted to a standardized format for use in the modeling software.
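A condensed sketch of the standardization step: per-format readers funnel into one common schema. The file extensions, column names, and rename mapping here are illustrative stand-ins for the real sources, each of which needed its own handling.

```python
# Load CSV, XLSX, or HTML into one standard schema; names are hypothetical.
import pandas as pd

STANDARD_COLUMNS = ["timestamp", "site_id", "power_kw"]


def load_any(path: str) -> pd.DataFrame:
    """Read a source file and return a DataFrame with the standard columns."""
    if path.endswith(".csv"):
        df = pd.read_csv(path)
    elif path.endswith(".xlsx"):
        df = pd.read_excel(path)
    elif path.endswith(".html"):
        df = pd.read_html(path)[0]  # first table on the page
    else:
        raise ValueError(f"unsupported format: {path}")

    # Rename source-specific headers to the standard schema (mapping is
    # illustrative; each real source required its own mapping).
    df = df.rename(columns={"time": "timestamp", "station": "site_id",
                            "output": "power_kw"})
    return df[STANDARD_COLUMNS]
```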
I completed a 12-month internship in the Insulator & Polymer Technologies group at ABB’s Corporate Research Center in Switzerland. As an electrical engineer in a group primarily composed of material scientists, my job was to perform the electrical tests and measurements on the samples we developed. On a day-to-day basis, I prepared insulating material samples and then tested them in long-term, high-voltage aging setups. Additionally, I performed dielectric spectroscopy and surface conductivity measurements to further characterize and analyze our samples. I was also responsible for writing and maintaining the manuals and documentation for each of the experimental setups. As many of the experiments I ran required chemical preparation and operated at thousands of volts, it was imperative that I prioritize safety while working in the labs and writing the manuals.
As a Cenovus Engineering Intern, I worked both in the office, on database error checking and validation, and in the field under Operations as a student operator. Working with IBM Maximo Asset Management software, I error-checked, added, and validated data from both the current system and external databases. I improved efficiency and accuracy by developing algorithms in Excel to error-check and correct thousands of entries, and I created formulas to dynamically populate data fields, which increased data entry speed. In Operations, I was responsible for conducting part of the regulatory inspections required by the Alberta Energy Regulator (formerly the Energy Resources Conservation Board). Outside of my assigned job responsibilities, I toured a drilling rig and a hydraulic fracturing operation, and I assisted with a gas plant turnaround.
I conducted synthetic biology research with the University of Calgary iGEM team. Our team developed a rapid, mobile strip test for specific bacteria, in our case EHEC E. coli. As the engineer on the team, I was responsible for developing the team website, and I also worked on the 3D and quantitative modeling teams. In addition to my technical duties, I worked with industry to ensure our research was ethical and applicable. In October and November, I presented our research at the University of Toronto and the Massachusetts Institute of Technology to international audiences of up to 1,000 people.