A simple pipeline that constantly polls the GitHub API at fixed intervals and performs operations on the resulting streams of GitHub repository data to display useful insights on a dashboard
Developed a streaming pipeline that displays real-time analytics of
GitHub repositories on a dashboard
Used Docker containers to deploy the Spark cluster that performs
distributed processing of the data returned by the API requests
Designed the dashboard to display the analytics results using Flask
The cluster uses PySpark to perform the relevant operations
such as filtering and map-reduce-style aggregations
Since the project was designed to run locally due to API rate
limits, Redis was used as a local database to cache data (a sketch of the polling-and-processing loop follows below)
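
A minimal sketch of that loop, assuming a local Spark session, a local Redis instance, and the public GitHub search endpoint; the query, the Redis key name, and the 60-second interval are illustrative guesses, not the project's actual values:

    import json, time
    import requests
    import redis
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("github-insights").getOrCreate()
    cache = redis.Redis(host="localhost", port=6379)

    def fetch_repos(query="stars:>100"):
        # one polling request against the GitHub search API
        resp = requests.get("https://api.github.com/search/repositories",
                            params={"q": query, "per_page": 50}, timeout=10)
        resp.raise_for_status()
        return resp.json()["items"]

    while True:
        items = fetch_repos()
        cache.set("latest_repos", json.dumps(items))  # cache the raw payload locally
        df = spark.createDataFrame(
            [(i["full_name"], i["stargazers_count"], i["language"]) for i in items],
            ["repo", "stars", "language"])
        # filtering plus a map-reduce-style aggregation, distributed over the cluster
        df.filter(df.stars > 500).groupBy("language").avg("stars").show()
        time.sleep(60)  # polling interval chosen to respect API rate limits
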
Full Stack E-Commerce Website
A simple web development project to build a full-stack e-commerce site that consists of:
The Presentation Layer (frontend), The Business Layer (backend), and The Data Layer (database)
Technology Stack for this project:
React
Spring Boot
PostgreSQL
Docker
Highlights:
Created and planned the designs for the project using Figma
Used Spring Boot to build the backend of the system because of
the various modules it ships with built-in, notably its
ORM support, which makes communicating with the database much simpler
Used React along with Tailwind CSS to design the user interface for the website
Vite was chosen as the bundler for the project due to its ability to trim out
unused modules and compress the whole project, which results in an overall smaller
size footprint, improving responsiveness to user interactions
Due to the reliability of relational databases and the small scale of this project, PostgreSQL
was chosen as the data-persistence option
The entire project was containerized using Docker so that each component stays portable
and migrating to another cloud provider is easier if the need were to arise; the
application was deployed to the cloud on Render
Association Rule Mining using Apriori Algorithm
A Python script that takes in a CSV file and finds rules whose support is higher than the threshold given by the user
A Python program that processes a collection of itemsets, in
this case the Walmart dataset, to find interesting patterns in the
data
The program uses the Apriori algorithm to prune itemsets that don't meet
a given support threshold, i.e. the fraction of the dataset's
transactions that contain the itemset
Rules that are above the given support threshold are generated from
the frequent itemsets that remain after pruning
Some of the generated rules can be misleading. To prevent
this, a lift measure was used to ignore rules whose lift is <= 1.0,
i.e. rules where the antecedent does not actually make the consequent
more likely (a sketch of the approach follows below)
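
A condensed sketch of the idea, assuming each CSV row is one transaction of item names; the function names and default thresholds are illustrative, not the script's actual ones:

    import csv
    from collections import defaultdict
    from itertools import combinations

    def load_transactions(path):
        with open(path) as f:
            return [set(filter(None, row)) for row in csv.reader(f)]

    def apriori(transactions, min_support):
        n = len(transactions)
        counts = defaultdict(int)
        for t in transactions:                      # count 1-itemsets
            for item in t:
                counts[frozenset([item])] += 1
        frequent = {s: c / n for s, c in counts.items() if c / n >= min_support}
        all_frequent, k = dict(frequent), 2
        while frequent:
            # join step: build size-k candidates from frequent (k-1)-itemsets
            candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
            counts = defaultdict(int)
            for t in transactions:
                for c in candidates:
                    if c <= t:
                        counts[c] += 1
            # prune step: keep only candidates that meet the support threshold
            frequent = {s: c / n for s, c in counts.items() if c / n >= min_support}
            all_frequent.update(frequent)
            k += 1
        return all_frequent

    def rules(freq, min_lift=1.0):
        out = []
        for itemset, support in freq.items():
            for i in range(1, len(itemset)):
                for lhs in map(frozenset, combinations(itemset, i)):
                    rhs = itemset - lhs
                    lift = support / (freq[lhs] * freq[rhs])
                    if lift > min_lift:             # lift <= 1.0 means misleading
                        out.append((set(lhs), set(rhs), support, lift))
        return out
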
Building and Training a Decision Tree Model Classifier
A Python program that uses various helper functions to build and train a decision tree on the provided dataset, then predicts the class of unlabelled data using the learnt tree.
The program lets us set the threshold that determines whether a node
of the tree becomes a leaf node or an internal node
To decide which attribute to split a node on, the Information
Gain criterion was used to rank attributes and pick the one with
the highest gain, i.e. the reduction in entropy achieved by splitting
on that attribute
Node splitting starts from the root node holding all the
examples and proceeds greedily by Information Gain, building the
decision tree until a termination condition is satisfied, i.e. the number of
examples in a node is less than or equal to the threshold
Split the major components of the algorithm into their own helper
functions to make debugging easier and keep the code cleaner
Used the above components, wrapped in helper functions, to train the
final decision tree and perform classification on the testing data, which
does not have the target attribute (a sketch of the helpers follows below)
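
A minimal sketch of those helpers, assuming categorical attributes and examples stored as dicts; names such as build_tree and leaf_threshold are illustrative, not the program's actual identifiers:

    import math
    from collections import Counter

    def entropy(labels):
        n = len(labels)
        return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

    def information_gain(examples, attr, target):
        # entropy before the split minus the weighted entropy after it
        gain = entropy([e[target] for e in examples])
        for value in {e[attr] for e in examples}:
            subset = [e[target] for e in examples if e[attr] == value]
            gain -= len(subset) / len(examples) * entropy(subset)
        return gain

    def build_tree(examples, attrs, target, leaf_threshold=5):
        labels = [e[target] for e in examples]
        # termination: few enough examples, pure node, or no attributes left
        if len(examples) <= leaf_threshold or len(set(labels)) == 1 or not attrs:
            return Counter(labels).most_common(1)[0][0]   # leaf = majority class
        best = max(attrs, key=lambda a: information_gain(examples, a, target))
        children = {}
        for value in {e[best] for e in examples}:
            subset = [e for e in examples if e[best] == value]
            children[value] = build_tree(subset, [a for a in attrs if a != best],
                                         target, leaf_threshold)
        return {"attr": best, "children": children}

    def classify(tree, example):
        while isinstance(tree, dict):
            # unseen attribute values fall through as None in this sketch
            tree = tree["children"].get(example[tree["attr"]])
        return tree
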
Imbalanced Learning of a Model using scikit-learn
A Python script that measures the performance of various classification models against an imbalanced dataset, i.e. one whose class distribution is heavily skewed towards one class. The skew gives the learnt models deceptively high numerical accuracy, because the models assign almost all new examples to the majority class.
Used scikit-learn to test a number of machine learning classifiers against
a number of datasets
Used pandas DataFrames to make preprocessing easier and to clean
the data, aiding learning from the training data
The performance of the models was recorded and compared to decide on
a model amongst the following candidates:
Decision Tree Classifier
K-Nearest Neighbor Classifier with K = 1 and 3
Gaussian Naive Bayes
Logistic Regression
Multi-Layer Perceptron (MLP) Neural Network
Random Forest Classifier
The final model was used to learn a classifier from an
imbalanced dataset of credit card transactions and predict fraudulent
transactions
This lets us examine the class that would usually be
overlooked due to the skewed distribution, which more often than not is
the class that is actually of interest (a comparison sketch follows below)
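
A sketch of the comparison harness, assuming the credit-card fraud CSV has a Class target column; the file name, split ratio, and hyperparameters here are assumptions:

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.linear_model import LogisticRegression
    from sklearn.neural_network import MLPClassifier
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import classification_report

    df = pd.read_csv("creditcard.csv")                  # assumed file name
    X, y = df.drop(columns=["Class"]), df["Class"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=42)

    models = {
        "decision tree": DecisionTreeClassifier(),
        "1-NN": KNeighborsClassifier(n_neighbors=1),
        "3-NN": KNeighborsClassifier(n_neighbors=3),
        "naive bayes": GaussianNB(),
        "logistic regression": LogisticRegression(max_iter=1000),
        "MLP": MLPClassifier(max_iter=500),
        "random forest": RandomForestClassifier(),
    }
    for name, model in models.items():
        model.fit(X_train, y_train)
        # per-class precision/recall exposes the minority class,
        # which raw accuracy hides on an imbalanced dataset
        print(name)
        print(classification_report(y_test, model.predict(X_test)))
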
Connect 4 Game against AI agent
A Python program that allows the user to play a game of Connect 4 against different AI agents, each of which uses a different decision-making algorithm to pick its next best move. Connect 4 is a solved game, so the player can win against the agent by playing the right moves.
Developed a Connect 4 game in Python to implement an AI agent
The game incorporates various algorithms such as: MiniMax
(depth-limited), Alpha-Beta pruning, and Expectimax
Each algorithm improves upon the previous: Alpha-Beta pruning,
for example, helps us search faster as the trees get deeper by efficiently
skipping branches that cannot change the outcome (a sketch follows after this list)
Expectimax is a probabilistic algorithm that takes into account the
probability of each opponent move and the resulting chances of the
agent winning, losing, or drawing to determine its choices
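
A sketch of the depth-limited MiniMax search with Alpha-Beta pruning, assuming a 6x7 board of ints (0 empty, 1 agent, 2 opponent) and a deliberately crude evaluation; a real agent would also score partial lines:

    import math

    ROWS, COLS = 6, 7

    def valid_moves(board):
        return [c for c in range(COLS) if board[0][c] == 0]

    def drop(board, col, player):
        new = [row[:] for row in board]
        for r in range(ROWS - 1, -1, -1):       # piece falls to the lowest gap
            if new[r][col] == 0:
                new[r][col] = player
                break
        return new

    def winner(board):
        for r in range(ROWS):
            for c in range(COLS):
                p = board[r][c]
                if p == 0:
                    continue
                for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                    if all(0 <= r + i * dr < ROWS and 0 <= c + i * dc < COLS
                           and board[r + i * dr][c + i * dc] == p for i in range(4)):
                        return p
        return 0

    def evaluate(board):
        # crude leaf score: win/loss only; partial lines are ignored here
        w = winner(board)
        return 1000 if w == 1 else -1000 if w == 2 else 0

    def minimax(board, depth, alpha, beta, maximizing):
        if depth == 0 or winner(board) or not valid_moves(board):
            return evaluate(board), None
        best = None
        if maximizing:
            value = -math.inf
            for col in valid_moves(board):
                score, _ = minimax(drop(board, col, 1), depth - 1, alpha, beta, False)
                if score > value:
                    value, best = score, col
                alpha = max(alpha, value)
                if alpha >= beta:               # prune: opponent won't allow this line
                    break
            return value, best
        value = math.inf
        for col in valid_moves(board):
            score, _ = minimax(drop(board, col, 2), depth - 1, alpha, beta, True)
            if score < value:
                value, best = score, col
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value, best

    # e.g. _, move = minimax([[0] * COLS for _ in range(ROWS)], 4, -math.inf, math.inf, True)
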
Dictionary Client
A Java project to learn how dictionary servers communicate using the DICT protocol. TCP is used as the transport to send and receive requests.
Developed a client-side dictionary application that makes requests to
various dictionary servers and parses their responses
Utilized the Java Socket API to establish communication between the client
and the dictionary servers
Referenced the Dictionary Server Protocol as specified in RFC 2229
to make the appropriate requests to the available dictionary servers and display the results in
the GUI (a protocol sketch follows below)
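
A minimal protocol walkthrough (sketched in Python for brevity; the project itself is in Java), assuming the public dict.org server on port 2628 and the DEFINE command from RFC 2229, with simplified status-code handling:

    import socket

    def define(word, host="dict.org", port=2628):
        # DICT is a line-based text protocol over TCP; lines end in CRLF
        with socket.create_connection((host, port), timeout=10) as sock:
            f = sock.makefile("rw", encoding="utf-8", newline="")
            f.readline()                        # 220 banner from the server
            f.write("DEFINE ! %s\r\n" % word)   # "!" = first database with a match
            f.flush()
            definitions = []
            line = f.readline()                 # 150 (definitions follow) or an error
            while line and line[:3] not in ("250", "501", "550", "552"):
                if line.startswith("151"):      # one definition body, ended by "."
                    body = []
                    while True:
                        text = f.readline()
                        if text.strip() == ".":
                            break
                        body.append(text.rstrip("\r\n"))
                    definitions.append("\n".join(body))
                line = f.readline()
            f.write("QUIT\r\n")
            f.flush()
            return definitions

    # e.g. print("\n\n".join(define("socket")))
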