Machine Learning App for Salary Prediction

Software Engineer Salary Prediction App

AI-powered web application to predict software engineer annual salaries using neural networks. Built with real-world survey data from 60,000+ professionals across 46 countries and 22 job titles.

Average Error: $20k | Countries: 46 | Job Titles: 22 | Training Samples: 60k+

📋 Project Overview

This project addresses one of the most critical questions in tech careers: What should my salary be?

Software engineer salaries vary dramatically based on multiple factors including location, company size, experience, education, and job specialization. This application leverages machine learning to provide data-driven salary predictions by analyzing these complex relationships using survey data from real software engineers worldwide.

Key Features of the App

  • ⭐ Intelligent Salary Prediction: Deep neural network model trained on 60,000+ real salary data points
  • ⭐ Interactive Data Visualization: Dynamic charts that update based on your selections
  • ⭐ Comprehensive Analysis: Explore salary trends by education, experience, age, company size, and remote work status
  • ⭐ Multi-dimensional Insights: Understand how different factors influence compensation in your specific country and role
  • ⭐ Smart Input Validation: Country-specific job title filtering ensures reliable predictions

Introduction

This post presents a web application for software engineer salary prediction. A deep neural network model predicts the salary of a software engineer from their country, job title, company size, experience, and educational background. The survey data for this application was collected from two publicly available sources: the Stack Overflow Developer Survey and AIJobs.net. All data covers the years 2022 and 2023. The app is completely free to use. Feel free to leave feedback. Test the live application from here.

The salary of a software engineer can vary significantly based on several factors, including location, company, technology stack experience, and educational background. Other important factors include job type, company size, and even company location. The engineer’s age may also play a role. Therefore, understanding these parameters is essential for accurately predicting salaries. To achieve reliable performance, we trained a specific machine learning model using these parameters as input features.

🎯 Data Analytics

(Source code for the first two steps is available on GitHub)

Data Collection

The application leverages two authoritative data sources:

  • Stack Overflow Developer Survey 2022-2023 (Primary source)
  • AIJobs.net Salary Database (Supplementary source)

Data Filtering Criteria:

  • Employment Type: Full-time positions only
  • Salary Range: USD $10,000 - $200,000
  • Minimum Sample Size: 50+ data points per country-job title combination
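The three criteria above can be sketched in a few lines of plain Python. This is only an illustration, not the project's actual code: the record layout (`employment`, `salary_usd`, `country`, `job_title`) and the inclusive salary bounds are assumptions made for the example.

```python
from collections import Counter

# Hypothetical record layout; the real survey columns are not shown in this post.
MIN_SALARY_USD = 10_000
MAX_SALARY_USD = 200_000
MIN_SAMPLES_PER_COMBO = 50


def filter_records(records, min_samples=MIN_SAMPLES_PER_COMBO):
    """Apply the three filtering criteria listed above, in order."""
    # Criteria 1 and 2: full-time positions with a salary inside the range
    # (bounds treated as inclusive, which is an assumption).
    kept = [
        r for r in records
        if r["employment"] == "full-time"
        and MIN_SALARY_USD <= r["salary_usd"] <= MAX_SALARY_USD
    ]
    # Criterion 3: keep only country/job-title pairs with enough samples.
    counts = Counter((r["country"], r["job_title"]) for r in kept)
    return [r for r in kept if counts[(r["country"], r["job_title"])] >= min_samples]


# Toy data with a lowered threshold so that something survives the filter.
sample = [
    {"country": "Germany", "job_title": "Data Engineer", "employment": "full-time", "salary_usd": 70_000},
    {"country": "Germany", "job_title": "Data Engineer", "employment": "full-time", "salary_usd": 85_000},
    {"country": "Germany", "job_title": "Data Engineer", "employment": "part-time", "salary_usd": 40_000},
    {"country": "India", "job_title": "Data Analyst", "employment": "full-time", "salary_usd": 5_000},
    {"country": "India", "job_title": "Data Analyst", "employment": "full-time", "salary_usd": 15_000},
]
filtered = filter_records(sample, min_samples=2)
```

With the toy data, only the two full-time Germany records remain: the part-time and out-of-range rows are dropped first, which leaves the India combination with a single sample.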

Data Processing Pipeline

  1. Data collection from multiple sources
  2. Cleaning and standardization
  3. Feature engineering
  4. Statistical validation
  5. Quality assurance filtering

Final Dataset: ~60,000 processed samples with 7 feature parameters

Data Analysis Summary

Let’s discuss the data analytics pipeline designed for this application. The pipeline consists of several stages: data collection, preprocessing, cleaning, feature engineering, post-processing, and storage. Of the two publicly available data sources we used, the larger was the Stack Overflow Developer Survey, and the other was AIJobs.net. We collected data from both sources, focusing exclusively on full-time positions with salaries ranging from USD 10,000 to 200,000. After cleaning, preprocessing, and feature engineering, we ended up with around 60,000 samples (data points) with seven feature parameters (indicators) to predict salary from. From this processed data, we kept only those country and job-title combinations for which enough samples were available (at least 50). Around 90% of these samples were used to train and validate the machine learning model. The remaining 10% was used for testing model performance. Click here to check the post where I demonstrate all these data collection and processing steps with detailed descriptions and source code.

Machine Learning Model

Architecture: Deep Neural Network (DNN)

Input Features

  • Country
  • Job Title
  • Company Size
  • Years of Experience
  • Education Level
  • Age Range
  • Remote Work Status
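Since these inputs mix categorical and continuous values, they have to be turned into numbers before a neural network can use them. The sketch below shows one common approach, one-hot encoding plus simple scaling. It is not necessarily the encoding used in this project: the field names, the `MAX_EXPERIENCE_YEARS` cap, and the ordering of categories are assumptions, and country and job title are left out for brevity.

```python
# Category values taken from the appendix; the ordering here is arbitrary.
VOCAB = {
    "company_size": ["Small", "Medium", "Large"],
    "education": ["Undergrad", "Bachelor", "Master", "Doctoral"],
    "remote": ["Not Remote", "Hybrid", "Full Remote"],
}
MAX_EXPERIENCE_YEARS = 50  # assumed cap, used only to scale into [0, 1]


def one_hot(value, choices):
    """Return a list with 1.0 at the position of `value` and 0.0 elsewhere."""
    if value not in choices:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in choices]


def encode(profile):
    """Turn one user profile into a flat numeric feature vector."""
    vector = []
    for name, choices in VOCAB.items():
        vector.extend(one_hot(profile[name], choices))
    # Continuous input: clip years of experience, then scale to [0, 1].
    years = min(max(profile["experience_years"], 0), MAX_EXPERIENCE_YEARS)
    vector.append(years / MAX_EXPERIENCE_YEARS)
    return vector


features = encode({
    "company_size": "Medium",
    "education": "Master",
    "remote": "Hybrid",
    "experience_years": 5,
})
```

The resulting vector has 11 entries here: 3 + 4 + 3 one-hot slots followed by the scaled experience value.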

Training Strategy

  • 90% training & validation
  • 10% testing (4,600 samples)
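A 90/10 hold-out split like the one above can be sketched with the standard library alone. The function name, the fixed seed, and the use of a plain shuffle are assumptions for illustration; the post does not state how the project actually draws its split.

```python
import random


def split_train_test(samples, test_fraction=0.10, seed=42):
    """Shuffle a copy of the samples and hold out a fraction for testing."""
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled items form the test set, the rest train/validation.
    return shuffled[n_test:], shuffled[:n_test]


train_val, test = split_train_test(range(1000))
```

Holding the test portion out before any training means the reported error is measured on samples the model has not seen.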

Performance Metrics

  • Average absolute error: ~$20,000 annually
  • High-confidence: ~$10,000 error (35% of test set)
  • Coverage: 1,012 unique combinations
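For readers unfamiliar with the metric, the average absolute error is the mean of |actual − predicted| over the test set. The sketch below computes it, together with the share of predictions that land within a chosen threshold. The second function is a hypothetical helper: the post does not define how the lower-error subset was selected, so it shows only one related statistic. The salary figures are made up for the example.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all samples."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def share_within(actual, predicted, threshold):
    """Fraction of samples whose absolute error is at most `threshold`."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= threshold)
    return hits / len(actual)


# Made-up annual salaries in USD, for illustration only.
actual = [60_000, 90_000, 120_000, 45_000]
predicted = [70_000, 85_000, 150_000, 45_000]

mae = mean_absolute_error(actual, predicted)      # (10k + 5k + 30k + 0) / 4
share = share_within(actual, predicted, 10_000)   # 3 of the 4 samples
```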

📊 ML Model Performance

The deep learning model demonstrates strong predictive capabilities:

  • $20k Average Absolute Error on 4,600 test samples
  • $10k High-Confidence Error on 1,600 samples (35%)
  • 1,012 Unique Combinations (Country × Job Title)

It's important to note that salary distributions vary significantly across countries and companies. The presence of outliers and exceptions is inherent to real-world compensation data, which the model accounts for in its predictions.

Model Performance Summary

The final machine learning model was able to predict the salaries of 4,600 software engineers across 46 countries and 22 job titles (listed at the end of this article) with an average absolute error of approximately USD 20,000 annually. Among these 4,600 test samples, around 1,600 were predicted with an average error of about USD 10,000 annually. It is important to note that salary ranges do not follow a universal pattern across countries or companies, so outliers and exceptions are quite common in the dataset. Click here to check the article where I demonstrate the steps for building and training the deep learning model with detailed descriptions and source code.

[Figure: Salary prediction ML model training loss curve]
[Figure: Actual vs. predicted salary plot for test data]

🎨 Application Features

The app features two main sections:

Prediction Interface

Input your parameters and get instant salary predictions

Analytics Dashboard

Explore salary distributions and trends through interactive visualizations

Interactive Web Application

Input Selection Section

  • Dynamic country and job title selection
  • Company size, education level, and experience inputs
  • Age range and remote work preferences
  • Real-time salary prediction based on selected parameters

Data Analysis Section

  • Median Salary Visualization: Compare salaries across different parameters
  • Distribution Histograms: View sample size distribution
  • Auto-updating Charts: Visualizations refresh automatically

Application Interface Summary

Input Selection Section

Now, let’s discuss the application itself. The salary prediction application consists of two main sections: the input selection section and the data analysis section. The salary prediction itself is also part of the input selection section. Alongside country and job title, this section offers several other fields, such as company size, education, age range, and experience in years. The inputs are therefore a combination of categorical and continuous values. Note that the job titles offered in the UI depend on the selected country: if the user changes the country, the job-title options reset accordingly. This prevents the machine learning model from predicting salaries for country and job-title combinations it never saw during training.
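The country-dependent job-title list described above can be derived directly from the training data. The following is a minimal sketch under assumed names (`build_title_options`, `titles_for`, and the record fields are hypothetical); the app's real UI code is not shown in this post.

```python
from collections import Counter, defaultdict


def build_title_options(records, min_samples=50):
    """Map each country to the job titles that have enough samples."""
    counts = Counter((r["country"], r["job_title"]) for r in records)
    options = defaultdict(list)
    for (country, title), n in sorted(counts.items()):
        if n >= min_samples:
            options[country].append(title)
    return dict(options)


def titles_for(options, country):
    """Job titles to offer in the UI once `country` is selected."""
    return options.get(country, [])


# Toy data with a lowered threshold for demonstration.
records = (
    [{"country": "Canada", "job_title": "Backend Developer"}] * 3
    + [{"country": "Canada", "job_title": "Data Architect"}] * 1
    + [{"country": "Japan", "job_title": "Data Engineer"}] * 2
)
options = build_title_options(records, min_samples=2)
canada_titles = titles_for(options, "Canada")
```

Because the dropdown is built from the same counts used for filtering, a user cannot request a combination that was excluded from training.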

Data Analysis Section

If a user changes the country or job title in the input fields, the corresponding plots in the data analysis section update automatically, so users can instantly explore different analytical insights simply by modifying those two inputs. This section shows two types of bar plots. The first plot shows the median salary for a specific country and job-title combination, broken down by a parameter that may significantly influence salary, such as education, work experience level, age, or remote work status. The second plot extends the first: it shows a histogram of the available survey samples for a selected category value of the parameter chosen for the first plot. For example, if you select the age group parameter to visualize the median salary for different age groups in the first plot, the second plot shows the histogram of available survey samples for any particular age group you choose. Both plots are shown for one combination of country and job title at a time, based on the selections made in the input selection section.
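The data behind the two plots can be sketched as a grouped median and a binned count. This is an illustration only: the function names, record fields, and the `bin_width` of USD 20,000 are assumptions, and the salary values are made up.

```python
from collections import defaultdict
from statistics import median


def median_salary_by(records, country, job_title, parameter):
    """First plot: median salary per value of `parameter` for one combination."""
    groups = defaultdict(list)
    for r in records:
        if r["country"] == country and r["job_title"] == job_title:
            groups[r[parameter]].append(r["salary_usd"])
    return {value: median(salaries) for value, salaries in groups.items()}


def salary_histogram(records, country, job_title, parameter, value, bin_width=20_000):
    """Second plot: sample counts per salary bin for one category value."""
    bins = defaultdict(int)
    for r in records:
        if (r["country"], r["job_title"], r[parameter]) == (country, job_title, value):
            # Each bin is labeled by its lower edge.
            bins[(r["salary_usd"] // bin_width) * bin_width] += 1
    return dict(sorted(bins.items()))


def row(country, education, salary):
    return {"country": country, "job_title": "Data Scientist",
            "education": education, "salary_usd": salary}


records = [
    row("Poland", "Master", 50_000), row("Poland", "Master", 70_000),
    row("Poland", "Master", 90_000), row("Poland", "Bachelor", 40_000),
    row("Poland", "Bachelor", 60_000), row("Spain", "Master", 55_000),
]
medians = median_salary_by(records, "Poland", "Data Scientist", "education")
hist = salary_histogram(records, "Poland", "Data Scientist", "education", "Master")
```

The Spain record is ignored in both results because each plot covers a single country and job-title combination at a time.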

[Figure: Salary analysis by category]
[Figure: Salary distribution by a category value]

🏆 Key Achievements & Specialties

Technical Excellence

  • Robust ML Pipeline: End-to-end automated pipeline from raw data to production model
  • Feature Engineering: Smart categorical encoding and normalization for optimal model performance
  • Model Generalization: Successfully handles diverse salary patterns across 46 countries

Data Science Innovation

  • Hybrid Data Sources: Merged and validated data from multiple authoritative sources
  • Quality-First Approach: Rigorous filtering ensures prediction reliability
  • Outlier Management: Intelligent handling of salary outliers and exceptions

User Experience

  • Intuitive Interface: Clean, responsive design with real-time feedback
  • Smart Validation: Prevents invalid country-job combinations
  • Rich Analytics: Beyond prediction - explore comprehensive salary trends
  • Educational Value: Learn what factors drive compensation in your field

Practical Impact

  • Career Planning: Make informed decisions about job offers and negotiations
  • Market Research: Understand salary benchmarks for different roles and locations
  • Transparency: Democratize access to compensation data

📱 Try the Live Application

Experience the power of data-driven salary insights:

The application is completely free to use. Test it with different parameters, explore salary trends, and gain valuable insights into software engineering compensation worldwide.

📚 Deep Dive: Technical Blog Posts

Want to understand the technical details behind this project? Check out the comprehensive blog series:

  1. Salary Prediction App Overview - Complete project walkthrough and application features
  2. Data Collection & Analysis Pipeline - Detailed explanation of data processing steps with source code
  3. Building the ML Model - Deep dive into model architecture, training process, and performance optimization

📈 Future Enhancements

  • Expand country and job title coverage
  • Incorporate additional factors (skills, certifications, company industry)
  • Time-series analysis for salary trend prediction
  • Personalized career path recommendations
  • API access for integration with other platforms

Conclusion

So there you have it: a complete salary prediction tool built from scratch using real survey data and deep learning! I have put together this application to help fellow software engineers get a realistic sense of what they should be earning based on their specific circumstances.

Is the model perfect? Absolutely not. With an average error of around USD 20,000, there’s definitely room for improvement. But here’s the thing: salary data is messy by nature. Every company has different compensation philosophies, every country has its own market dynamics, and there are countless factors that can push salaries up or down. Despite these challenges, getting predictions within USD 10,000 for about a third of the cases shows that the model has learned some meaningful patterns from the data.

What I am most excited about is the interactive data analysis section. You can play around with different countries, job titles, and other input features, and then see how education, experience, or age affects salaries in your specific market. That way, you can dive deep into the data yourself. It is not just about getting a number; it is also about understanding the bigger picture of software engineering compensation.

The app is completely free to use, so I encourage you to give it a try and see what insights you can uncover for your own situation. Whether you’re negotiating a job offer, planning your next career move, or just curious about how your salary stacks up, I hope you find it useful.

And please, drop your feedback! Let me know what works, what does not, and what features you would like to see added. Your input will help make this tool better for many other people.

💬 Feedback & Support

Loved the app? Have suggestions? Found a bug?

Acknowledgments

  • Stack Overflow for their comprehensive annual developer survey
  • AIJobs.net for providing salary data for Machine Learning Engineers
  • The open-source community for amazing ML libraries and tools

If this project helped you, consider giving it a ⭐ on GitHub!

Appendix

All input features along with their unique values are listed below.

Job Titles

Backend Developer, Frontend Developer, Fullstack Developer, Desktop App Developer, Data Science Or Ml Specialist, Mobile App Developer, Data Engineer, Devops Specialist, Engineering Manager, Cloud Infrastructure Engineer, Embedded Systems Developer, Site Reliability Engineer, Data Analyst, Data Scientist, Business Intelligence Engineer, Data Architect, Data Manager, Developer Qa Or Test, Game Or Graphics Developer, Machine Learning Engineer, Security Professional, System Administrator

Countries

Argentina, Austria, Australia, Belgium, Bulgaria, Brazil, Canada, Switzerland, Chile, China, Colombia, Czechia, Germany, Denmark, Estonia, Spain, Finland, France, United Kingdom, Greece, Croatia, Hungary, Ireland, Israel, India, Iran, Italy, Japan, South Korea, Lithuania, Mexico, Netherlands, Norway, New Zealand, Poland, Portugal, Romania, Serbia, Russian Federation, Sweden, Singapore, Slovenia, Turkey, Ukraine, United States, South Africa

Remote Work Types

Not Remote, Hybrid, Full Remote

Education (Degree)

Bachelor, Master, Undergrad, Doctoral

Company Size

Medium, Large, Small

Age Range (in years)

-18, 18-24, 25-34, 35-44, 45-54, 55+
